xxHash is an Extremely fast Hash algorithm, processing at RAM speed limits
Benchmark: 100 passes over 181 documents.
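MD5:

import hashlib
import timeit

from tqdm import tqdm

# paths: an iterable of pathlib.Path objects, defined elsewhere
start = timeit.default_timer()
for i in tqdm(range(100)):
    total = 0
    for path in paths:
        if not path.is_file():
            continue
        # Read the whole file and hash it in one call
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        total += 1
print(total)
end = timeit.default_timer()
print(f"Time: {end - start}")

181
Time: 10.780842124004266

xxhash: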
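
import timeit

import xxhash
from tqdm import tqdm

# paths: an iterable of pathlib.Path objects, defined elsewhere
start = timeit.default_timer()
for i in tqdm(range(100)):
    total = 0
    for path in paths:
        if not path.is_file():
            continue
        # The 64-bit xxHash constructor is xxh64 (there is no xxhash.xx64)
        digest = xxhash.xxh64(path.read_bytes()).hexdigest()
        total += 1
print(total)
end = timeit.default_timer()
print(f"Time: {end - start}")

181
Time: 10.775027380004758
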
Not significantly faster. I guess most of the time is spent in I/O. I'll go with MD5 because it's more common, familiar to others, and implemented everywhere.
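One way to check the I/O guess would be to load the data into memory first and time only the hashing. The following is a minimal sketch under that assumption: the helper `time_hashing` and the stand-in `blobs` list are illustrative names, and the 16 KiB buffer size is made up rather than taken from the real documents. With real files, `blobs` could be built as `[p.read_bytes() for p in paths if p.is_file()]`.

```python
import hashlib
import timeit


def time_hashing(blobs, hash_factory, rounds=100):
    """Return the seconds spent hashing every blob `rounds` times, with no file I/O."""
    start = timeit.default_timer()
    for _ in range(rounds):
        for data in blobs:
            hash_factory(data).hexdigest()
    return timeit.default_timer() - start


# Stand-in data: 181 in-memory buffers of 16 KiB each (an assumed size,
# not the size of the real documents).
blobs = [bytes([i % 256]) * 16384 for i in range(181)]

md5_seconds = time_hashing(blobs, hashlib.md5)
print(f"MD5, hashing only: {md5_seconds:.3f} s")

try:
    import xxhash  # third-party package; skipped if it is not installed
except ImportError:
    xxhash = None

if xxhash is not None:
    xxh_seconds = time_hashing(blobs, xxhash.xxh64)
    print(f"xxh64, hashing only: {xxh_seconds:.3f} s")
```

If the hashing-only times come out much smaller than the roughly 10.8 seconds measured above, that would support the idea that reading the files dominates the benchmark.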