Entropy code

zeynepaydogan zeynepaydogan at proton.me
Thu May 5 19:13:35 PDT 2022


int main()
{
    // Process the original high-entropy array and get a pointer to the reduced-entropy array.
    const auto lowEntropyArrayPointer = reduce_entropy(payload);

    // Copy the processed array into a new buffer for the entropy calculation.
    // Despite its name, original_hi_entropy_payload holds the processed (reduced-entropy) data.
    BYTE original_hi_entropy_payload[sizeof payload * 2 - remaining_bytes] = {0};
    memcpy_s(original_hi_entropy_payload, sizeof original_hi_entropy_payload, lowEntropyArrayPointer, sizeof payload * 2 - remaining_bytes);
    const auto first_array = calculate_entropy(reinterpret_cast<char*>(original_hi_entropy_payload), sizeof original_hi_entropy_payload); // Entropy after processing

    // Restore the original array.
    // Despite its name, restored_low_entropy_payload holds the restored (original) data.
    const auto restored_payload = restore_original(original_hi_entropy_payload);
    BYTE restored_low_entropy_payload[(payload_size_after_entropy_reduction + 1) / 2] = {0};
    memcpy_s(restored_low_entropy_payload, (payload_size_after_entropy_reduction + 1) / 2, restored_payload, (payload_size_after_entropy_reduction + 1) / 2);
    const auto second_array = calculate_entropy(reinterpret_cast<char*>(restored_low_entropy_payload), sizeof restored_low_entropy_payload); // Entropy of the restored array

    // Entropy of the original, unprocessed sample.
    const auto original_array_entropy = calculate_entropy(reinterpret_cast<char*>(payload), sizeof payload);

    // Present the results on the console.
    printf("\r\nOriginal array Entropy is: %f\r\n", original_array_entropy);
    printf("Processed array Entropy is: %f\r\n", first_array);
    printf("\r\nRestored array Entropy is: %f\r\n", second_array);
    getchar();
    return 0;
}

Compiling and running the source code provided prints the following results to the console:

Original array Entropy is: 4.825164
Processed array Entropy is: 3.451938
Restored array Entropy is: 4.825164
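
The figures above are consistent with Shannon entropy measured in bits per byte, which ranges from 0 (every byte identical) to 8 (all 256 byte values equally likely). As a point of reference, here is a minimal sketch of such a measurement. It assumes that calculate_entropy computes a byte-histogram Shannon entropy; the name shannon_entropy is hypothetical and this is not necessarily the routine used above.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not the calculate_entropy routine from the post.
// Returns the Shannon entropy of a byte buffer in bits per byte (0.0 to 8.0).
double shannon_entropy(const std::uint8_t* data, std::size_t size)
{
    if (data == nullptr || size == 0) {
        return 0.0; // an empty buffer carries no information
    }

    // Count how often each of the 256 possible byte values occurs.
    std::array<std::size_t, 256> counts{};
    for (std::size_t i = 0; i < size; ++i) {
        ++counts[data[i]];
    }

    // H = -sum(p * log2(p)) over the byte values that actually occur.
    double entropy = 0.0;
    for (const std::size_t count : counts) {
        if (count == 0) {
            continue;
        }
        const double p = static_cast<double>(count) / static_cast<double>(size);
        entropy -= p * std::log2(p);
    }
    return entropy;
}
```

With this definition, a buffer of four identical bytes scores 0 and a buffer of four distinct bytes scores 2. In general a buffer of n bytes cannot score above log2(n), so a small sample always measures below 8 no matter how random its contents are.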

I also tested with a very large random array, and its entropy was reduced from more than 7 to less than 4. Because the original array in this example is small, the reduction is modest, but it is still visible and proves the point.
It is also important to realize that all calculations are done first on the original array and then on the converted one. You don't need to know the original array's
characteristics (such as its original length) to restore it successfully, because the algorithm used for the conversion is completely reversible. However, if you
wanted to use randomly sized low-entropy chunk patterns, you would need to store not only the converted array but also the data needed for complete restoration. The
code would be more complex, but the results should be better in terms of hindering signature creation against it.

Reducing the entropy of obfuscated malware code is simple; it can be used to evade detection, and on top of that it might confer some additional protection against signature creation.
The presented code can be adapted to create solutions that could help prevent the use of entropy as a means of detecting malware. Creating better low-entropy byte patterns, using different
mathematical equations and differently sized low-entropy pieces of code, may increase the robustness of the method.

I wish for everyone to become more educated, smarter, and more ethical.

@zeynep