Juggling Modules For Numpy 2.0 to 1.3 Pickle Conversion
Occasionally one has to jump through a lot of hoops and do some pretty bizarre and unorthodox things when coding, especially when working with multiple versions of libraries. At work all of our iMacs were recently upgraded to Snow Leopard, with Python 2.6 and its libraries installed (all in 64-bit). It's nice to upgrade ... when it doesn't break things. Unfortunately, numpy 1.4 was installed during the upgrade. This was only unfortunate because the supercomputer we also need to run on has numpy 1.3, which won't unpickle objects pickled with numpy 1.4. The culprit was a new version number on dtype, which caused numpy 1.3 to bail.
Evaluating my options, I took the only logical one: convert a numpy 1.4 pickle into a numpy 1.3 pickle. How would I do this? By converting the stream of bytes being read from the 1.4 pickled file while unpickling! After all, understanding the binary pickle protocol and the numpy pickling protocol and then changing things on the fly was the only logical approach. After some hacking I decided that converting from one file format to the other and just saving the result was more useful, since I would be unpickling things quite often. Some more hacking brought about this lovely function.
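    import os

    def convert_binary(ins, outs):
        # state 0: looking for the dtype version number
        # state 1: looking for the extra trailing None
        state = 0
        chars = '\x00\x00'  # the two most recently seen bytes
        while True:
            b = ins.read(1)
            c = b  # byte to write out; may be altered or dropped below
            if b == '':
                break
            if state == 0 and b == '\x04' and chars == '(K':
                # peek at the next byte without consuming it
                b = ins.read(1)
                ins.seek(-1, os.SEEK_CUR)
                if b == 'U':
                    # rewrite the version number from 4 to 3
                    c = '\x03'
                    state = 1
            elif state == 1 and b == 'N' and chars == 'K\x00':
                b = ins.read(1)
                ins.seek(-1, os.SEEK_CUR)
                if b == 't':
                    # drop the None that closes the tuple
                    # (presumably a field numpy 1.3 does not expect)
                    c = ''
                    state = 0
            # note: after a peek, b holds the peeked byte
            chars = chars[1:] + b
            outs.write(c)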
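To make the magic bytes in that function less mysterious, here is a small sketch using the standard library's `pickletools`. The `SAMPLE` bytes are hand-written to imitate the kind of state tuple the function looks for. They are an assumption about the layout, not bytes taken from a real numpy pickle. The `downgrade` helper is a hypothetical in-memory version of the same two edits. It is written to run on a current Python, not only 2.6.

```python
import pickle
import pickletools

# Hand-written pickle fragment imitating a 9-item state tuple:
# MARK, version 4, a one-character string, ..., flags 0, None, TUPLE, STOP.
# This is an illustrative assumption, not output captured from numpy.
SAMPLE = b'(K\x04U\x01<NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00Nt.'

def opcode_names(data):
    """List the pickle opcode names found in a byte string."""
    return [op.name for op, arg, pos in pickletools.genops(data)]

def downgrade(data):
    """Apply the same two byte-level edits as convert_binary, in memory."""
    # '(K\x04U' is MARK, BININT1 4, then a short string: change 4 to 3.
    head = data.replace(b'(K\x04U', b'(K\x03U', 1)
    # 'K\x00Nt' is BININT1 0, NONE, TUPLE: drop the NONE.
    return head.replace(b'K\x00Nt', b'K\x00t', 1)

if __name__ == '__main__':
    print(opcode_names(SAMPLE))
    print(pickle.loads(SAMPLE))
    print(pickle.loads(downgrade(SAMPLE)))
```

Seen this way, `(`, `K`, `U`, `N` and `t` are just the MARK, BININT1, SHORT_BINSTRING, NONE and TUPLE opcodes, and the conversion amounts to turning a 9-item tuple starting with 4 into an 8-item tuple starting with 3.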
And it worked for my test cases. Sadly, when I attempted to convert one of the actual files we needed, the resulting file wasn't unpicklable. So a new solution was needed. Next I would try installing numpy 1.4 alongside numpy 1.3 in a clean Python 2.6 install. In the end I had numpy 1.4 (which turned out to be 2.0, as I used the latest svn checkout) in site-packages and numpy 1.3 in /tmp/site-packages. Now I would need to unpickle under numpy 2.0, "un-import" the numpy modules, import numpy 1.3, and re-pickle. The following code juggles the modules during the conversion process. I forgot to mention that it did require a small change to pickle.py, to make it ignore the fact that the numpy objects in the pickle were technically different from the ones in the module currently in the global namespace.
    import sys; sys.path.append('./')
    import os
    import pickle
    pickle.ignore_module_mismatch = True

    sys_modules20 = None
    sys_modules13 = None

    def main():
        global sys_modules20, sys_modules13
        tmp = list(sys.path)
        import numpy as numpy20
        # remove the 2.0 from the main list
        sys_modules20 = {}
        for k in sys.modules.keys():
            if k.startswith('numpy'):
                sys_modules20[k] = sys.modules.pop(k)
        sys.path.insert(0, '/tmp/site-packages')
        import numpy as numpy13
        # copy the list with numpy 1.3
        sys_modules13 = {}
        for k in sys.modules.keys():
            if k.startswith('numpy'):
                sys_modules13[k] = sys.modules.pop(k)
        for infile in sys.argv[1:]:
            # backup the pickle file before conversion
            newinfile = infile + '.bck'
            print 'Moving %s to %s' % (infile, newinfile)
            os.rename(infile, newinfile)
            outfile = infile
            infile = newinfile
            convert(infile, outfile)

    def convert(infile, outfile):
        global sys_modules20, sys_modules13
        sys.modules.update(sys_modules20)
        print '\tloading...'
        trees = pickle.load(open(infile, 'rb'))
        sys.modules.update(sys_modules13)
        print '\tsaving...'
        pickle.dump(trees, open(outfile, 'wb'))

    if __name__ == '__main__':
        main()
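The juggling trick can be tried without two numpy installs. The sketch below uses a made-up module called `fakelib` (purely hypothetical, built with `types.ModuleType`) in two "versions", and swaps them in and out of `sys.modules` the same way the script does. It also shows the identity check that stock pickle performs, which is the check the `pickle.py` change was there to silence. Unlike the script, `stash_modules` matches `prefix` and `prefix.` only, so an unrelated module whose name merely begins with the same letters is left alone.

```python
import pickle
import sys
import types

def make_fake(version):
    """Build a stand-in 'fakelib' module (hypothetical, not numpy)."""
    mod = types.ModuleType('fakelib')
    mod.__version__ = version
    def reconstruct():
        return version
    # Make the function look like a top-level member of fakelib.
    reconstruct.__module__ = 'fakelib'
    reconstruct.__qualname__ = 'reconstruct'
    mod.reconstruct = reconstruct
    return mod

def stash_modules(prefix):
    """Pop and return the sys.modules entries for prefix and its submodules."""
    stashed = {}
    for name in list(sys.modules):
        if name == prefix or name.startswith(prefix + '.'):
            stashed[name] = sys.modules.pop(name)
    return stashed

# "Import" each version in turn, stashing it away afterwards.
sys.modules['fakelib'] = make_fake('2.0')
stash20 = stash_modules('fakelib')
sys.modules['fakelib'] = make_fake('1.3')
stash13 = stash_modules('fakelib')

def active_version(stash):
    """Swap a stash back in and report which version an import now sees."""
    sys.modules.update(stash)
    import fakelib
    return fakelib.__version__

# Grab an object while 2.0 is active, then try to pickle it under 1.3.
sys.modules.update(stash20)
func20 = sys.modules['fakelib'].reconstruct
sys.modules.update(stash13)
try:
    pickle.dumps(func20)
    mismatch = False
except pickle.PicklingError:
    # pickle found fakelib.reconstruct, but it is not the same object
    mismatch = True
```

With unmodified pickle the final `dumps` is refused because the function found by name in the active module is a different object from the one being pickled. That is roughly the situation the numpy objects are in once the 1.3 modules have been swapped in.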