Non-negative Least Squares Fitting

Collecting Reference MS Files

A series of reference spectra is required if you want to do non-negative least squares (NNLS) fitting of your data. This repository contains two example reference files: “ref_spec.txt” and “ref_spec2.MSL”. An MSL file is a text-based library format that can be exported by programs such as AMDIS or the NIST Mass Spectral Database. The .txt file was generated by hand; its format is described below. Both files must follow their formats exactly; however, MSL files are typically autogenerated by external software and may not need manual modification. In both formats, a line that starts with # is treated as a comment. This is useful for adding notes or for disabling a reference spectrum without deleting it entirely.

Hand-generated reference files must be text files whose names end with the extension “.txt”. Each reference compound must have at least two labels: “NAME” and “NUM PEAKS”. “NAME” gives the reference name (ideally something concise), and “NUM PEAKS” is followed by at least two space-separated columns of MS data. The first column contains the m/z values, and the second column contains the intensity values. Intensities are normalized on import, so it is not necessary to do this by hand. Other labels can be included to record extra metadata about the reference compound. Reference compounds must be separated from each other by a blank line. Below is a small sample of one of these files:

NAME:octane
FROM:www.massbank.jp
ID_NUM: JP004695
NUM PEAKS:
  42 14.07 141
  43 99.99 999
  44 2.54 25
  45 4.03 40
  53 1.58 16
  55 19.83 198
  .
  .
  .

The online MS repository MassBank is a useful place to find these mass and intensity values. The data from that site are already formatted correctly for this file type.
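
To make the format rules above concrete, here is a minimal sketch of a reader for the hand-generated .txt format. It is not the parser that gcms uses; the function name parse_reference_text and the second sample compound are made up for illustration. The text says only that intensities are normalized on import, so scaling each spectrum so that its largest peak is 1.0 is an assumption.

```python
import re


def parse_reference_text(text):
    """Parse hand-written reference spectra (hypothetical helper, not part of gcms).

    Returns {name: {"meta": {label: value}, "peaks": [(mz, intensity), ...]}}.
    """
    refs = {}
    # Reference compounds are separated from each other by a blank line.
    for block in re.split(r"\n\s*\n", text.strip()):
        meta, peaks, in_peaks = {}, [], False
        for line in block.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # comment lines start with '#'
            if in_peaks:
                fields = line.split()
                try:
                    # First column is m/z, second column is intensity;
                    # any further columns are ignored.
                    peaks.append((float(fields[0]), float(fields[1])))
                except (IndexError, ValueError):
                    continue  # skip rows that are not numeric peak data
            elif ":" in line:
                label, _, value = line.partition(":")
                label = label.strip().upper()
                if label == "NUM PEAKS":
                    in_peaks = True  # everything after this label is peak data
                else:
                    meta[label] = value.strip()
        if "NAME" not in meta:
            continue  # "NAME" is a required label
        # Assumption: normalize so that the most intense peak equals 1.0.
        top = max((intensity for _, intensity in peaks), default=0.0)
        if top > 0:
            peaks = [(mz, intensity / top) for mz, intensity in peaks]
        refs[meta["NAME"]] = {"meta": meta, "peaks": peaks}
    return refs


SAMPLE = """\
# Lines starting with # are comments.
NAME:octane
FROM:www.massbank.jp
ID_NUM: JP004695
NUM PEAKS:
  42 14.07 141
  43 99.99 999
  44 2.54 25

NAME:made_up_compound
NUM PEAKS:
  10 5.0
  # 11 9.0 (this peak is commented out)
  12 2.5
"""

refs = parse_reference_text(SAMPLE)
for name, ref in refs.items():
    print(name, ref["peaks"])
```

The sketch keeps every label other than “NUM PEAKS” as free-form metadata, which matches the rule that extra labels are allowed but only “NAME” and “NUM PEAKS” are required.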

Loading Reference Spectra

To import this reference file, we will use the AIAFile object’s ref_build function. The steps needed to set up our environment in IPython are shown below. (They are unchanged from the previous sections but are repeated for clarity.)

In: import matplotlib.pyplot as plt
In: import gcms
In: data = gcms.AIAFile('data/datasample1.CDF')

At this point, we are ready to read in our reference file using the function AIAFile.ref_build.
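
As background for why the reference spectra are needed: NNLS models a measured spectrum as a sum of reference spectra, each multiplied by a coefficient that is not allowed to go negative. The minimal sketch below illustrates that idea in plain Python with a simple cyclic coordinate-descent solver. It is not the algorithm gcms uses internally, and the function name nnls_coordinate_descent and the toy spectra are made up for illustration.

```python
def nnls_coordinate_descent(refs, measured, n_iter=500):
    """Find coefs >= 0 minimizing ||measured - sum_j coefs[j] * refs[j]||^2.

    refs:     list of reference spectra, each a list of intensities
              sampled on the same m/z grid as `measured`
    measured: list of intensities for the spectrum being fitted
    Illustrative solver only (cyclic coordinate descent).
    """
    coefs = [0.0] * len(refs)
    residual = list(measured)  # equals measured - model while all coefs are 0
    norms = [sum(v * v for v in ref) for ref in refs]
    for _ in range(n_iter):
        for j, ref in enumerate(refs):
            if norms[j] == 0.0:
                continue  # an all-zero reference cannot contribute
            # Best unconstrained update for coefficient j, clipped at zero.
            step = sum(r * v for r, v in zip(residual, ref)) / norms[j]
            new_coef = max(0.0, coefs[j] + step)
            delta = new_coef - coefs[j]
            if delta != 0.0:
                residual = [r - delta * v for r, v in zip(residual, ref)]
                coefs[j] = new_coef
    return coefs


# Toy reference spectra on a shared five-point m/z grid (made-up numbers).
ref_a = [0.20, 0.14, 1.00, 0.03, 0.04]
ref_b = [1.00, 0.50, 0.10, 0.00, 0.30]
ref_c = [0.00, 0.00, 0.20, 1.00, 0.10]

# A synthetic "measured" spectrum: 70% of ref_a plus 30% of ref_b.
measured = [0.7 * a + 0.3 * b for a, b in zip(ref_a, ref_b)]

coefs = nnls_coordinate_descent([ref_a, ref_b, ref_c], measured)
print([round(c, 4) for c in coefs])
```

Because the synthetic spectrum contains no contribution from ref_c, its fitted coefficient should stay at zero, while the other two should come out close to the mixing fractions used to build the spectrum.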
