"TypeError: Unicode-objects must be encoded before hashing"

I get this error:

    Traceback (most recent call last):
      File "python_md5_cracker.py", line 27, in <module>
        m.update(line)
    TypeError: Unicode-objects must be encoded before hashing

when I try to run this code in Python 3.2.2:

    import hashlib, sys
    m = hashlib.md5()
    hash = ""
    hash_file = input("What is the file name in which the hash resides? ")
    wordlist = input("What is your wordlist? (Enter the file name) ")
    try:
        hashdocument = open(hash_file,"r")
    except IOError:
        print("Invalid file.")
        raw_input()
        sys.exit()
    else:
        hash = hashdocument.readline()
        hash = hash.replace("\n","")

    try:
        wordlistfile = open(wordlist,"r")
    except IOError:
        print("Invalid file.")
        raw_input()
        sys.exit()
    else:
        pass
    for line in wordlistfile:
        m = hashlib.md5() #flush the buffer (this caused a massive problem when placed at the beginning of the script, because the buffer kept getting overwritten, thus comparing incorrect hashes)
        line = line.replace("\n","")
        m.update(line)  # this is the call that raises the TypeError
        word_hash = m.hexdigest()
        if word_hash==hash:
            print("Collision! The word corresponding to the given hash is", line)
            input()
            sys.exit()

    print("The hash given does not correspond to any supplied word in the wordlist.")
    input()
    sys.exit()

It is probably looking for a character encoding for the text read from wordlistfile.

 wordlistfile = open(wordlist,"r",encoding='utf-8') 

Or, if you are working line by line, encode each line before passing it to the hash:

 line.encode('utf-8') 

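You have to specify an encoding such as utf-8. Here is a simple way to do it.

This example hashes a random 256-bit number with SHA-256 (the digest will differ on every run):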

    >>> import hashlib
    >>> import random
    >>> hashlib.sha256(str(random.getrandbits(256)).encode('utf-8')).hexdigest()
    'cd183a211ed2434eac4f31b317c573c50e6c24e3a28b82ddcb0bf8bedf387a9f'

The error already tells you what you have to do. MD5 operates on bytes, so you have to encode the Unicode string into bytes, for example with line.encode('utf-8').
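To make that concrete, here is a minimal sketch that reproduces the error and then applies the fix. The sample word is made up for illustration, and the exact wording of the error message may differ between Python versions.

```python
import hashlib

word = "password"  # hypothetical sample word, not from the question

# Passing a str to the hash raises the TypeError from the question.
try:
    hashlib.md5(word)
    raised = False
except TypeError:
    raised = True

# Encoding the str to bytes first is accepted.
digest = hashlib.md5(word.encode("utf-8")).hexdigest()

print(raised)
print(digest)
```

The same applies to `m.update(...)`: it accepts bytes, not str.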

Please take a look at that answer first.

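Now, the error message is clear: you can only use bytes, not Python strings (what used to be unicode in Python < 3), so you have to encode the string with your preferred encoding: utf-32, utf-16, utf-8, or even one of the restricted 8-bit encodings (what some might call code pages).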
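The choice of encoding matters because the hash is computed over the encoded bytes. A small sketch, assuming a made-up sample word with one non-ASCII character, shows that each encoding yields a different digest, while plain ASCII text encodes identically under utf-8 and latin-1:

```python
import hashlib

word = "héllo"  # hypothetical sample word with a non-ASCII character

# Hash the same text under several encodings.
digests = {
    enc: hashlib.md5(word.encode(enc)).hexdigest()
    for enc in ("utf-8", "utf-16", "utf-32", "latin-1")
}

for enc, digest in digests.items():
    print(enc, digest)

# Pure ASCII text produces the same bytes in utf-8 and latin-1.
ascii_same = "hello".encode("utf-8") == "hello".encode("latin-1")
print(ascii_same)
```

So the encoding used for hashing has to match the one used when the target hash was originally produced.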

When reading from a file in text mode, Python 3 automatically decodes the bytes in the wordlist file to Unicode. I suggest you do this:

 m.update(line.encode(wordlistfile.encoding)) 

That way, the data pushed to the md5 algorithm is encoded exactly like the underlying file.
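As a rough sketch of this approach, the loop could be wrapped in a function like the one below. The name `find_word`, its parameters, and the temporary word list are assumptions made for illustration; they are not part of the original script.

```python
import hashlib
import os
import tempfile


def find_word(wordlist_path, target_hash, encoding="utf-8"):
    """Return the first word whose MD5 hex digest equals target_hash, else None."""
    with open(wordlist_path, "r", encoding=encoding) as wordlistfile:
        for line in wordlistfile:
            word = line.rstrip("\n")
            # Re-encode with the same encoding the file was decoded with.
            data = word.encode(wordlistfile.encoding)
            if hashlib.md5(data).hexdigest() == target_hash:
                return word
    return None


# Demo with a throwaway word list (contents are made up).
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write("letmein\npassword\nqwerty\n")
    path = tmp.name

target = hashlib.md5(b"password").hexdigest()
found = find_word(path, target)
os.remove(path)
print(found)
```

Passing `encoding` explicitly avoids depending on the platform default that `open` would otherwise pick.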

You could open the file in binary mode:

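    import hashlib

    # hash_file and wordlist are the file names from the question
    with open(hash_file) as file:
        control_hash = file.readline().rstrip("\n")

    wordlistfile = open(wordlist, "rb")
    # ...
    for line in wordlistfile:
        # lines are already bytes in binary mode, so no encoding step is needed
        if hashlib.md5(line.rstrip(b'\n\r')).hexdigest() == control_hash:
            # collision
            print("Collision! The word is", line.rstrip(b'\n\r'))
            break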

To store a password (Python 3):

    import hashlib, os

    password_salt = os.urandom(32).hex()
    password = '12345'

    hash = hashlib.sha512()
    hash.update(('%s%s' % (password_salt, password)).encode('utf-8'))
    password_hash = hash.hexdigest()
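Checking a password later means recomputing the hash with the stored salt and comparing. The sketch below assumes the same salt-then-password scheme as the snippet above; the helper names `hash_password` and `verify` are made up for illustration.

```python
import hashlib
import hmac
import os


def hash_password(password, salt_hex):
    """Salted SHA-512 hex digest: salt followed by password, UTF-8 encoded."""
    return hashlib.sha512((salt_hex + password).encode("utf-8")).hexdigest()


def verify(candidate, salt_hex, stored_hash):
    # hmac.compare_digest compares the two digests in constant time.
    return hmac.compare_digest(hash_password(candidate, salt_hex), stored_hash)


# At registration: generate a salt and store it alongside the hash.
password_salt = os.urandom(32).hex()
password_hash = hash_password("12345", password_salt)

print(verify("12345", password_salt, password_hash))
print(verify("54321", password_salt, password_hash))
```

For real password storage, a deliberately slow key-derivation function such as `hashlib.pbkdf2_hmac` or `hashlib.scrypt` is generally preferred over a single SHA-512 pass.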