How to know if a file is decrypted or not

You really can't, if you're just encrypting / decrypting text.

If you know that the encrypted string is "kdo" and the encryption method is a Caesar shift, the plaintext could just as easily be "IBM" as "HAL". You'd have to have some idea of what the plaintext "looks like". For instance, if you know the plaintext is the name of a Stanley Kubrick character, you'd have a decent idea of which one it should be.
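
To see the ambiguity concretely, here is a minimal sketch that lists every Caesar decryption of "kdo". The helper names `caesar_shift` and `all_caesar_candidates` are made up for this illustration, and it assumes plain ASCII letters.

```python
import string

def caesar_shift(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions; leave other characters alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

def all_caesar_candidates(ciphertext: str) -> dict[int, str]:
    """Map each possible key (1..25) to the plaintext that key would produce."""
    return {key: caesar_shift(ciphertext, -key) for key in range(1, 26)}

candidates = all_caesar_candidates("kdo")
print(candidates[2])  # ibm
print(candidates[3])  # hal
```

Nothing in the ciphertext says which of the 25 candidates is "right"; that decision comes from what you expect the plaintext to look like.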

If you have a longer string, it's much easier to narrow things down. A large text file has far fewer intelligible decryptions than the three-character example above. But you'll still have to judge for yourself whether it's actually decrypted.

On the other hand, if you're decrypting an entire file in some specific format (.docx, etc.), you can be reasonably sure the file is decrypted if the parsing program (Word, etc.) can read it.
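
You can automate a rough version of the "can the parser read it" check. The sketch below relies on the fact that a .docx file is a ZIP package that normally contains `[Content_Types].xml` and `word/document.xml`. The function name `looks_like_docx` is hypothetical, and passing this check does not guarantee Word will open the file without complaint.

```python
import io
import zipfile

# Parts that a normal .docx package is expected to contain.
REQUIRED_PARTS = {"[Content_Types].xml", "word/document.xml"}

def looks_like_docx(data: bytes) -> bool:
    """Return True if `data` opens as a ZIP archive holding the usual .docx parts."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        # A wrong key almost always yields bytes that are not a valid archive.
        return False
    return REQUIRED_PARTS <= names
```

A decryption attempt with the wrong key will almost never produce a valid ZIP directory, so a `True` here is strong evidence that the key was right.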

You absolutely can tell with varying degrees of certainty if a file, or even string, was successfully decrypted. Most of the challenges at cryptopals depend on it. I have begun to make a tool for ciphertext bruteforce and analysis that automates this very task. You can find it here if you want to take a look.
(it needs a lot of cleaning up, don't judge me)

My original goal for this project was to improve my efficiency in CTF crypto challenges with a simple brute-force tool, but I'm starting to work on implementing a lot more analysis. As it stands, it can brute-force Caesar, single-byte XOR, Atbash, and a few encodings, with repeating-key XOR developed but not integrated yet.

The way it works now

takes input ciphertext string or file of newline-delimited ciphertext strings
attempts to decrypt with entire keyspace of supported ciphers
after every decrypt attempt, runs a detection function on cleartext to determine if the decrypted text is English
displays most likely guesses
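
The steps above can be sketched roughly as follows. This is not the code from my tool, just a stripped-down illustration for single-byte XOR only: the function names are invented for the example, and the scoring is a crude letter-and-space ratio standing in for real English detection.

```python
def xor_single_byte(data: bytes, key: int) -> bytes:
    """XOR every byte of `data` with the same one-byte key."""
    return bytes(b ^ key for b in data)

def letter_ratio(data: bytes) -> float:
    """Fraction of bytes that are ASCII letters or spaces (a crude text score)."""
    if not data:
        return 0.0
    hits = sum(1 for b in data if b == 32 or 65 <= b <= 90 or 97 <= b <= 122)
    return hits / len(data)

def brute_force_xor(ciphertext: bytes, top: int = 3) -> list[tuple[float, int, bytes]]:
    """Try all 256 keys and return the `top` best-scoring (score, key, guess) tuples."""
    scored = []
    for key in range(256):
        guess = xor_single_byte(ciphertext, key)
        scored.append((letter_ratio(guess), key, guess))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top]

secret = xor_single_byte(b"the quick brown fox jumps over the lazy dog", 0x5A)
for score, key, guess in brute_force_xor(secret):
    print(f"{score:.2f} key=0x{key:02x} {guess!r}")
```

The correct key (0x5A here) comes out on top because it is the only one whose output is entirely letters and spaces. On shorter or messier inputs the ranking is much less clear-cut.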

The one thing that makes this process finicky is that the thresholds for English detection must be adjusted depending on ciphertext length. It defaults to requiring 60% of the cleartext to be words and 75% of the cleartext to be letters to register a match. This setting rarely gives false positives, and false negatives even less often, on medium to long cleartexts (anything over a few strings). On short ciphertexts, however, some false positives will pop up and many false negatives will slip by. In testing, I have had to lower the thresholds by 30% or more to detect a match on some short strings, and in the process generate many more false positives that I have to sift through to find the real match.
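
Here is a minimal sketch of what such a threshold check might look like. It is an illustration, not the tool's actual code: `looks_english`, the tiny `COMMON_WORDS` set, and the reading of "60% words" as "60% of whitespace-separated tokens are in the word list" are all assumptions made for the example.

```python
import string

# Tiny illustrative word list; a real detector would load a full dictionary.
COMMON_WORDS = {
    "the", "be", "to", "of", "and", "a", "in", "that", "have", "i", "it",
    "for", "not", "on", "with", "is", "at", "this", "attack", "dawn",
}

def looks_english(text: str, word_threshold: float = 0.60,
                  letter_threshold: float = 0.75) -> bool:
    """Return True if enough of `text` is letters and enough tokens are known words."""
    tokens = text.lower().split()
    if not tokens:
        return False
    letter_ratio = sum(ch.isalpha() for ch in text) / len(text)
    known = sum(tok.strip(string.punctuation) in COMMON_WORDS for tok in tokens)
    word_ratio = known / len(tokens)
    return word_ratio >= word_threshold and letter_ratio >= letter_threshold

print(looks_english("attack at dawn"))                     # True
print(looks_english("meet at noon"))                       # False
print(looks_english("meet at noon", word_threshold=0.30))  # True
```

With this toy word list, "meet at noon" is a false negative at the default thresholds because only one of its three tokens is recognised. Lowering the word threshold catches it, at the cost of letting more junk through.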

I strongly recommend working through the cryptopals challenges from the beginning if you are interested in learning more about making oracles and breaking crypto. It starts easy and progresses into real-world attacks, like making a Bleichenbacher oracle, part of what makes the DROWN attack work.

tl;dr

You need to make a module that detects English and apply it to the result of every decryption attempt. Or just fork mine and make it better.
In cases where the cleartext is not going to be English or another language, some more advanced analysis is required.

If you have some idea what the cleartext is, you can use that knowledge to guess when you might have cracked the ciphertext.

If you think that the cleartext is English, for instance, start looking for English words in your decrypt attempt.

If you think the cleartext is a zip file, zip files have a signature at the beginning of the file. Look for that signature.
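
A minimal sketch of that check: ZIP data normally starts with the bytes `PK\x03\x04` (or `PK\x05\x06` for an empty archive, `PK\x07\x08` for a spanned one). The function name `has_zip_signature` is just for illustration.

```python
ZIP_SIGNATURES = (
    b"PK\x03\x04",  # local file header: the usual start of an archive
    b"PK\x05\x06",  # end of central directory: an empty archive
    b"PK\x07\x08",  # spanned archive marker
)

def has_zip_signature(data: bytes) -> bool:
    """Return True if `data` begins with one of the known ZIP signatures."""
    return data.startswith(ZIP_SIGNATURES)

print(has_zip_signature(b"PK\x03\x04" + b"\x00" * 26))  # True
print(has_zip_signature(b"just some text"))             # False
```

Four matching bytes can turn up by chance when you try a large keyspace, so treat a hit as a candidate to verify (for example by actually opening the archive), not as proof.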

If you think the cleartext is an email, look for telltale email headers.
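
For email, one rough way to do this is to let Python's standard `email` parser pull out the headers and count how many familiar ones show up. The header set and the `minimum` cut-off below are arbitrary choices for the sketch, not a standard.

```python
from email import message_from_string

# Headers that commonly appear at the top of a mail message.
TELLTALE_HEADERS = {"from", "to", "subject", "date", "received", "message-id"}

def looks_like_email(text: str, minimum: int = 2) -> bool:
    """Return True if at least `minimum` telltale headers are present."""
    msg = message_from_string(text)
    present = {name.lower() for name in msg.keys()}
    return len(present & TELLTALE_HEADERS) >= minimum

sample = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: lunch\n"
    "\n"
    "Noon at the usual place?\n"
)
print(looks_like_email(sample))           # True
print(looks_like_email("q8#Zk!! 2@@pl"))  # False
```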

In general, you could try to look for the "information content" of the decrypt attempt. Plaintext normally has a lower information content than ciphertext, though this isn't true for a simple Caesar cipher, which only relabels the letters and leaves their frequency distribution unchanged.
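
One common way to put a number on "information content" is Shannon entropy over byte frequencies, in bits per byte. The sketch below is just that one measure, with helper names invented for the example. It also shows the Caesar caveat: shifting letters does not change the score.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Entropy of the byte frequency distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

def caesar_bytes(data: bytes, shift: int) -> bytes:
    """Caesar-shift lowercase ASCII letters; leave every other byte untouched."""
    return bytes((b - 97 + shift) % 26 + 97 if 97 <= b <= 122 else b for b in data)

plain = b"attack at dawn and retreat at dusk"
shifted = caesar_bytes(plain, 3)
uniform = bytes(range(256))  # every byte value once: the maximum, 8 bits per byte

print(shannon_entropy(plain))    # well below 8
print(shannon_entropy(shifted))  # same as the plaintext score
print(shannon_entropy(uniform))  # 8.0
```

Output from a strong modern cipher should score close to the uniform case on a large enough sample, so a decrypt attempt that scores much lower is worth a closer look. Short inputs give noisy scores, so this works best on files, not on a handful of bytes.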

But you need to start with some inkling of what the cleartext might contain, even if (as above) it's merely "a lower information content score than the ciphertext".
