Datastore symbolic links

Creating a folder of symbolic links to your DataStore data

The dsget.py script queries the datastore over D-Bus and writes the returned data to a flat file in the /home/olpc directory. The dslinks.py script then deletes any old symbolic links and creates a new set under the /home/olpc/datalinks/ folder. The structure underneath ./datalinks is <XO username>/<activity name>/<datastore unique path>.<extension based on the MIME type of the file in question>.
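
For example, a document saved by the Write activity on an XO whose nickname is mary might end up linked as something like the following (the nickname and the hashed datastore filename are invented for illustration; the .odt extension comes from the MIME type mapping in dslinks.py):

/home/olpc/datalinks/mary/AbiWordActivity/01f3b2c4-9e7a-4d2b-8c5e-2a6d7f3e9b10.odt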

It would have been nice to combine dsget.py and dslinks.py into a single script, but reading the datastore uses up memory, and Python is slow to return memory to the system. Running one script and then the other seems necessary to avoid out-of-memory errors. The two files need to be saved in the same directory.

Usage: dslinks.py executes dsget.py as a subprocess. So the only requirement is that dsget.py must be in the same directory as dslinks.py. Then the command required to generate the folder of symbolic links under /home/olpc/datalinks/ is:

[olpc@xo olpc]$ ./dslinks.py

At the moment, the only delivery method is to cut and paste the following scripts into an editor and then save the files onto your XO. This is cumbersome, and requires that you run 'chmod 755 <scripts>' to make them executable. I'll try to find a better solution.

Following is the dsget.py script:

#!/usr/bin/env python
#filename: dsget.py
from sugar.datastore import datastore
from sugar import profile                                 
from sugar import util
import sys
import os
import subprocess
#change the following to False if duplicate XO names exist in your group
names_are_unique = True
linkbase = '/home/olpc'
fname = linkbase + '/datastorelist.txt'

        
#the following find returns only a few DSObjects -- for debugging
#(results,count)=datastore.find(dict(title='etoys'))

#the following find returns all journal items
(results,count)=datastore.find(dict())

print 'number of found items:',count
pactivity = ''
fd = open(fname,'w')
for f in results: #f iterates over results -- a list of DSObjects
    src = f.get_file_path() #returns the full path of the file
    if src == '': #if empty, there is no file related to this metadata
        f.destroy()
        continue #go get the next iteration
    info = os.stat(src) #get the directory information about this file
    datastoredict=f.get_metadata().get_dictionary() #get the property dictionary
    pactivity = datastoredict['activity']
    tablename = pactivity.split('.')[-1:][0]
    keys = datastoredict.keys()
    object_id = f.object_id
    #print the data for the keys that are interesting
    fd.write( "%s|%s|%s|%s|%s"%(
                tablename,
                info.st_size,
                datastoredict['mime_type'],
                profile.get_nick_name(),
                src,))
    for k in keys:
       if k == 'preview':continue
       if k == 'mime_type':continue
       fd.write ('|%s'%(datastoredict[k],)) 
    fd.write('\n')
    f.destroy()
fd.close()
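
Each line of datastorelist.txt is a '|'-separated record whose first five fields are the activity's short name, the file size, the MIME type, the XO nickname, and the datastore file path, followed by the remaining metadata values. As a quick sanity check, here is a minimal sketch (the dscheck.py name is arbitrary) that prints just those five fields:

#!/usr/bin/env python
#filename: dscheck.py -- quick look at the flat file written by dsget.py
fname = '/home/olpc/datastorelist.txt'
for line in open(fname,'r'):
    fields = line.rstrip('\n').split('|')
    if len(fields) < 5: continue #skip malformed lines
    activity, size, mime, nick, src = fields[:5]
    print activity, size, mime, nick, src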

And this creates the links:

#!/usr/bin/env python
#filename: dslinks.py
from sugar.datastore import datastore
from sugar import profile                                 
from sugar import util
import sys
import os
import subprocess
#change the following to False if duplicate XO names exist in your group
names_are_unique = True
linkbase = '/home/olpc'
fname = linkbase + '/datastorelist.txt'

proc = subprocess.Popen('./dsget.py', 
                        shell=True, 
                        stdout=subprocess.PIPE,
                        )
stdout_value = proc.communicate()[0]
print str(stdout_value)

extensions = {'image/png':'png','application/pdf':'pdf','application/x-tar':'tar','text/plain':'txt',
              'image/jpeg':'jpg','video/x-theora+ogg':'ogg','application/vnd.olpc-sugar':'xo',
              'application/x-gzip':'gz','video/ogg':'','application/vnd.oasis.opendocument.text':'odt',
              'text/x-python':'py','application/x-squeak-project':'pr'}

#create a guaranteed unique user id
key = profile.get_pubkey()
# If you want a shorter key, you can hash that like:
key_hash = util._sha_data(key)
hashed_key = ''
if not names_are_unique:    
    hashed_key = '_'+util.printable_hash(key_hash)

#delete any links that currently exist
deletecmd = '/bin/rm -rf '+linkbase + '/datalinks/*'
os.system(deletecmd)

for flatfileline in open(fname,'r'):
    inline = flatfileline.split("|")
    if len(inline)<6:continue
    user = inline[3]+hashed_key
    activity = inline[0]
    tablename = activity.split('.')[-1:][0]
    if tablename == '':
        tablename = 'noactivity'
    ext = extensions.get(inline[2],'')
    if ext != '': ext = '.' + ext
    #guess the actual filename from the name of the copy made by the journal code
    #get the index of the prefix to the filename
    i = inline[4].find('/data/')
    if i == -1:
        print 'filename prefix not found: '+str(inline[4])
        continue  #abort processing
    fakename = inline[4][i+6:]
    #print 'fakename:',fakename
    dsfilename = fakename.split('(')
    dsfilename = dsfilename[0]
    dsfilename = dsfilename.split('.')
    dsfilename = dsfilename[0]
    symbolicbase =  '/'.join((linkbase,'datalinks',user,tablename,))
    symboliclink = symbolicbase + '/' + dsfilename + ext
    target = inline[4][:i]+'/datastore/store/'+ dsfilename
    #skip entries whose datastore file no longer exists
    #(open() would raise an exception here rather than return a false value)
    if not os.path.exists(target):
        continue
    #make sure the directory exists where we want to write the link
    if not os.path.isdir(symbolicbase):
        os.makedirs(symbolicbase)
    linkcommand = '/bin/ln -s '+target + ' ' + symboliclink 
    print linkcommand
    
    try:
        retcode = subprocess.call(linkcommand,shell=True)
        if retcode < 0:
            print >>sys.stderr, "Child was terminated by signal", -retcode
        else:
            print >>sys.stderr, "Child returned", retcode
    except OSError, e:
        print >>sys.stderr, "Execution failed:", e
    
    #err = os.system(linkcommand)
    #if err != 0:
        #print 'os.system error number: ',err
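
The link itself does not have to go through /bin/ln: Python's os.symlink and os.path.lexists can do the same job without spawning a shell. A minimal sketch of that alternative (the make_link helper is hypothetical; it expects the same target, symbolicbase and symboliclink values built in the loop above):

import os

def make_link(target, symbolicbase, symboliclink):
    #make sure the directory exists where we want to write the link
    if not os.path.isdir(symbolicbase):
        os.makedirs(symbolicbase)
    #skip links that already exist (including broken ones)
    if os.path.lexists(symboliclink):
        return
    os.symlink(target, symboliclink)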

A script to simplify selection of arbitrary data and destinations

Just hit Enter/Return twice to accept the default pattern and destination. Change the defaults by editing the two "if" blocks in the script below.

#!/bin/bash
echo -n 'What pattern in the datalinks tree should be matched? '
read -e PATTERN
echo -n 'And what destination PATH should be copied to? '
read -e DST
if [ -z "$DST" ]
then
     DST="/media/KINGSTON" #change this to suit your typical destination
fi
if [ -z "$PATTERN" ]
then
     PATTERN=".odt"  #change this for the grep patterns you typically require
fi
#echo -n "copying to $DST using pattern $PATTERN"
#set -x   #delete the first '#' to generate a debug listing on your terminal screen
for FILEPATH in `find /home/olpc/datalinks | grep "$PATTERN"`
do
   PREFIX=`echo "$FILEPATH" | sed -e 's:^/home/olpc/datalinks/::' -e 's:/[a-f0-9a-z\.-]*$::'`
   mkdir -p "$DST/$PREFIX"
   cp -p "$FILEPATH" "$DST/$PREFIX"
done
echo 'done'
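
For example, assuming the script above is saved as copylinks.sh (the name is arbitrary) and made executable, hitting Enter at both prompts copies every .odt link to /media/KINGSTON:

[olpc@xo olpc]$ ./copylinks.sh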