20 June 2013

459. Briefly: Proxies, browsing and paranoia

It's easy to configure Chrome to use Tor to preserve a semblance of privacy online (http://verahill.blogspot.com.au/2013/06/450-tor-and-chrome-on-debian.html). There are a few simple things you can do to make your life with a proxy easier to manage.

This post presumes that you've followed that post first: http://verahill.blogspot.com.au/2013/06/450-tor-and-chrome-on-debian.html. In particular, it presumes that you have turned off pre-fetching.

In addition, you may want to think about the following:

Incognito mode
At the lower end of the scale, you may or may not want to use incognito mode consistently. This has little bearing on privacy online; it only matters if you don't want to leave traces of your browsing history on your own computer. Although that should only be an issue if someone gets physical access to your computer, you never know whether the next browser bug will give someone complete access to your history. Most likely it would only expose metadata (which is what the NSA brouhaha has mostly been about).

Anyway, if you feel this is an important issue then you should probably be encrypting your data as well, e.g. with encfs.

Search engine
It's probably more important to rethink how you are using search engines in Chrome. First of all, you should turn off instant search. Secondly, you will want to consider whether you want to use google as the default search engine for queries in the URL field. Two main search engines come to mind: duckduckgo.com, and startpage.com. While duckduckgo.com has a higher profile, startpage.com is a bit more full-featured, and that's because it takes your query, anonymizes it, and passes it on to google. It's also based in Europe, which I (probably naively) feel is safer.

Go to startpage.com, and click on 'add to chrome' under the search box. Then set Startpage HTTPS as the default in Chrome:


Also consider making sure that google.com isn't your home page in chrome.

Proxy
Even though Tor works fine in general, it can be a bit slow, and you don't want to use it for everything anyway. There are times when you don't want to use a proxy. In my case, that's when I visit journal websites or my university websites. Also, I have set up a reverse proxy via my home router, and it's faster than Tor, so for a lot of things I'm fine with using that.

Proxy SwitchySharp supports rule-based proxy switching. In my case, I've set it up so that if I use google, I use Tor. If I go to RSC, ACS, Wiley or Elsevier journals, I use my university connection, and for everything else, I use my home router.
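The kind of rule set described above can be sketched in a few lines of Python. This only illustrates first-match-wins switching and is not how the extension itself is implemented. The host patterns, the profile names and the home proxy port are assumptions made up for the example (9050 is Tor's default SOCKS port), and I've assumed that "university connection" simply means a direct connection.

```python
from fnmatch import fnmatchcase

# Hypothetical proxy profiles. Only the Tor port is a standard default.
PROXIES = {
    "tor": "socks5://127.0.0.1:9050",
    "university": "direct",              # assumed: plain, unproxied connection
    "home": "socks5://127.0.0.1:8080",   # assumed port for the home proxy
}

# (wildcard pattern, profile) pairs, checked in order. Patterns are assumptions.
RULES = [
    ("*.google.com", "tor"),
    ("*.rsc.org", "university"),
    ("*.acs.org", "university"),
    ("*.wiley.com", "university"),
    ("*.elsevier.com", "university"),
]


def pick_profile(host, rules=RULES, default="home"):
    """Return the name of the first profile whose pattern matches host."""
    host = host.lower()
    for pattern, profile in rules:
        # The wildcard covers subdomains; pattern[2:] is the bare domain.
        if fnmatchcase(host, pattern) or host == pattern[2:]:
            return profile
    return default


for host in ("www.google.com", "pubs.acs.org", "example.org"):
    profile = pick_profile(host)
    print(host, "->", profile, PROXIES[profile])
```

Anything that matches no rule falls through to the default profile, which mirrors the "for everything else" part of the setup.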



You then just need to click your way through to the proxyswitcher alternative:
The icon will change colour depending on which proxy is active. Pretty neat!

19 June 2013

458. Briefly: Converting GRAMS ASP ascii data to two-column ascii data

We have a couple of CARY 630 FT-IR /ATR instruments.

I hate them. Apart from being the Mac equivalent of spectrometers (if you try to do anything remotely creative you'll have a bad day, although point and click works well most of the time), they aren't able to output data in any reasonable format.

At least not the way I'd define 'reasonable', i.e. a simple x-y ascii data file and/or JCAMP-DX and/or even .csv. The default output is a binary .a2r file.

The only ascii-type format is a proprietary GRAMS ASP ascii file, for which I haven't been able to get the formal specs. Using google it seems as if the German arm of agilent did publish it, but when clicking on the links I'm told the file no longer exists, and google cache isn't playing ball.

Anyway. Luckily the format seems pretty simple.

Here are the first ten lines of an .asp file:
1798
4000.41016197344
650.579285428114
1
128
4
98.4862110783457
98.4183476284596
98.4587565715995
98.5660576694946
* The first line is the number of acquired data points
* The second line is the highest reciprocal wavelength in cm-1.
* The third line is the lowest reciprocal wavelength in cm-1.
* I don't know what the fourth and fifth lines signify. It could be dynamic resolution in the Y axis.
* The sixth line is the native resolution, i.e. 4 cm-1/data point. However, the data seems to be zero-filled, i.e. it seems the resolution is really ca 1.86 cm-1/pt.
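The ca 1.86 cm-1/pt figure can be checked directly from the first three header values. This is a minimal Python 3 sketch; the variable names are mine, and it assumes the points are evenly spaced between the highest and lowest wavenumber.

```python
# Header values copied from the example .asp file shown above
n_points = 1798
x_max = 4000.41016197344   # highest wavenumber, cm-1
x_min = 650.579285428114   # lowest wavenumber, cm-1

# n_points evenly spaced values span (n_points - 1) intervals
spacing = (x_max - x_min) / (n_points - 1)
print("%.3f cm-1 per point" % spacing)  # prints 1.864 cm-1 per point
```

That is roughly half of the nominal 4 cm-1, which is why the data looks zero-filled.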
Knowing the above, we can write a simple python script, which we'll call asp2asc, which will allow us to generate files suitable for gnuplot.
Example usage:
./asp2asc -i data.asp -o data.dat


asp2asc:
#!/usr/bin/python
#converts GRAMS ascii (asp) output from a CARY 630 FT-ATR-IR to a two-column ascii dat file
import sys

def getvars(arguments):
 exit=0
 ver=0.1
 try: 
  if "-o" in arguments:
   theoutput=arguments[arguments.index('-o')+1]
   print 'Output: %s.'%theoutput
  elif "--output" in arguments:
   theoutput=arguments[arguments.index('--output')+1]
   print 'Output: %s.'%theoutput
  else:
   print ''
   print 'Error -- no output file defined.'
   print ''
   arguments="--help"
 except:
  arguments="--help"

 try: 
  if "-i" in arguments:
   theinput=arguments[arguments.index('-i')+1]
   print 'Input: %s.'%theinput
  elif "--input" in arguments:
   theinput=arguments[arguments.index('--input')+1]
   print 'Input: %s.'%theinput
  else:
   print ''
   print 'Error -- no input file defined.'
   print ''
   arguments="--help"
 except:
  arguments="--help"

 try:
  if ("-h" in arguments) or ("--help" in arguments):
   print " "
   print "\t\tThis is asp2asc, a tool for converting"
   print "\t\tGRAMS ASP ascii files to two-column ascii files"
   print "\t\tThis is version",ver
   print "\tUsage:"
   print "\t-h\t--help   \tYou're looking at it."
   print "\t-i\t--input \tInput file, e.g. data.asp"
   print "\t-o\t--output \tOutput file, e.g. data.dat"
   print ""
   exit=1
 except:
  a=1   #do nothing
 
 if exit==1:
  sys.exit(0)
 print ''

 switches={'i':theinput,'o':theoutput}
 return switches

def getparams(datafile):
 params=[]
 n=1
 for line in datafile:
  try:
   params+=[int(line.rstrip('\n'))] 
  except:
   params+=[float(line.rstrip('\n'))] 
  if n==6:
   break
  n+=1 
 return params
 
def getydata(datafile):
 ydata=[]
 for line in datafile:
  ydata+=[float(line.rstrip('\n'))]
  
 return ydata
 
 
def makexdata(xpts,xmax,increment):
 n=0
 xdata=[]
 while n < xpts:
  xdata+=[xmax-n*increment]
  n+=1
 return xdata

def writexydata(outfile,xdata,ydata):
 for n in range(0,len(xdata)):
  outfile.write(str(xdata[n])+'\t'+str(ydata[n])+'\n')
 return 0

if __name__ == "__main__":
 arguments=sys.argv[1:len(sys.argv)]

 switches=getvars(arguments)
 infile=open(switches['i'],'r')
 
 params=getparams(infile) 
 ydata=getydata(infile) # needs getparams to have parked file reading at the 7th line 

 infile.close()

 xdata=makexdata(params[0],params[1],(params[1]-params[2])/(params[0]-1))

 if len(xdata)==len(ydata):
  outfile=open(switches['o'],'w')
  success=writexydata(outfile,xdata,ydata)
  outfile.close()  
 else:
  print 'Something bad happened:'
  print 'Number of X data points not equal to number of Y data points'
  print 'x pts: %i, y pts: %i'%(len(xdata),len(ydata))

Of course you could do this easily in a spreadsheet too, but I honestly find myself avoiding spreadsheet programmes like the plague ever since I learned how to use sed, gawk, and python.
Also, WHY do they make it so unnecessarily difficult to export your own data?
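If you are on Python 3, where the print statements in asp2asc won't run, the same conversion can be sketched more compactly. This is a minimal sketch under the same guesses about the format as above (six header lines, then one y value per line, evenly spaced x values running from high to low). read_asp and write_xy are names made up for this example.

```python
def read_asp(path):
    """Read a GRAMS ASP ascii file and return (xdata, ydata) as lists."""
    with open(path) as handle:
        values = [line.strip() for line in handle if line.strip()]

    # Header: number of points, highest x, lowest x, then three more lines
    n_points = int(values[0])
    x_max = float(values[1])
    x_min = float(values[2])

    # Everything after the six header lines is y data
    ydata = [float(value) for value in values[6:]]
    if len(ydata) != n_points:
        raise ValueError(
            "header says %d points, file has %d" % (n_points, len(ydata))
        )

    # Evenly spaced x values, running from x_max down to x_min
    step = (x_max - x_min) / (n_points - 1) if n_points > 1 else 0.0
    xdata = [x_max - i * step for i in range(n_points)]
    return xdata, ydata


def write_xy(path, xdata, ydata):
    """Write tab-separated x/y pairs, one pair per line."""
    with open(path, "w") as handle:
        for x, y in zip(xdata, ydata):
            handle.write("%s\t%s\n" % (x, y))
```

Calling write_xy('data.dat', *read_asp('data.asp')) should then give the same kind of two-column file as asp2asc.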

457. Very Briefly: Microsoft has a Tor exit node?

Whenever I play around with Tor I use ipchicken.com or whatsmyip.org to make sure that I'm indeed using a proxy. I also normally do a whois on the IP address, to see who's running the exit node.

Today I ended up with the IP address 168.61.8.22.

whois 168.61.8.22
NetRange: 168.61.0.0 - 168.63.255.255
CIDR: 168.62.0.0/15, 168.61.0.0/16
OriginAS:
NetName: MSFT-EP
NetHandle: NET-168-61-0-0-1
Parent: NET-168-0-0-0-0
NetType: Direct Assignment
RegDate: 2011-06-22
Updated: 2012-10-16
Ref: http://whois.arin.net/rest/net/NET-168-61-0-0-1

OrgName: Microsoft Corp
OrgId: MSFT-Z
Address: One Microsoft Way
City: Redmond
StateProv: WA
PostalCode: 98052
Country: US
RegDate: 2011-06-22
Updated: 2013-04-12
Ref: http://whois.arin.net/rest/org/MSFT-Z

OrgTechHandle: MSFTP-ARIN
OrgTechName: MSFT-POC
OrgTechPhone: +1-425-882-8080
OrgTechEmail: iprrms@microsoft.com
OrgTechRef: http://whois.arin.net/rest/poc/MSFTP-ARIN

OrgAbuseHandle: HOTMA-ARIN
OrgAbuseName: Hotmail Abuse
OrgAbusePhone: +1-425-882-8080
OrgAbuseEmail: abuse@hotmail.com
OrgAbuseRef: http://whois.arin.net/rest/poc/HOTMA-ARIN

OrgAbuseHandle: MSNAB-ARIN
OrgAbuseName: MSN ABUSE
OrgAbusePhone: +1-425-882-8080
OrgAbuseEmail: abuse@msn.com
OrgAbuseRef: http://whois.arin.net/rest/poc/MSNAB-ARIN

OrgNOCHandle: ZM23-ARIN
OrgNOCName: Microsoft Corporation
OrgNOCPhone: +1-425-882-8080
OrgNOCEmail: noc@microsoft.com
OrgNOCRef: http://whois.arin.net/rest/poc/ZM23-ARIN

OrgAbuseHandle: ABUSE231-ARIN
OrgAbuseName: Abuse
OrgAbusePhone: +1-425-882-8080
OrgAbuseEmail: abuse@microsoft.com
OrgAbuseRef: http://whois.arin.net/rest/poc/ABUSE231-ARIN
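To double-check that the exit node address really falls inside one of the CIDR blocks in the whois record, Python's standard ipaddress module will do. This is just a small Python 3 sketch; the variable names are mine, and the address and blocks are copied from the output above.

```python
import ipaddress

# Exit node address and the CIDR blocks listed in the whois record
exit_ip = ipaddress.ip_address("168.61.8.22")
cidr_blocks = [
    ipaddress.ip_network(block)
    for block in ("168.62.0.0/15", "168.61.0.0/16")
]

# Keep only the blocks that contain the exit node address
matches = [str(net) for net in cidr_blocks if exit_ip in net]
print(matches)  # prints ['168.61.0.0/16']
```
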
That Microsoft is listed as the organisation doesn't necessarily mean that they are running the node (could be a hosting company) but it still seems that this might actually be MS running this one. Maybe it's just for research purposes, but it still seemed a bit surprising.

Microsoft as a company isn't exactly known for doing things out of the goodness of their hearts. Oh well.