Baseline correction

Description

This module performs baseline correction on raw data files. It is designed to compensate for gradual shifts in the chromatographic baseline by detecting the baseline and then subtracting it from the raw data intensity values. The module proceeds as follows for each raw data file passed to it:

  1. The full range of m/z values present in the raw data is divided into a series of bins of a specified width (see m/z bin width).
  2. For each bin a chromatogram is constructed from the raw data points whose m/z values fall within the bin. This chromatogram (see Chromatogram type) may be either the base peak chromatogram or total ion count (TIC) chromatogram.
  3. A baseline is estimated for each bin's chromatogram by asymmetric least squares [1] (see Smoothing and Asymmetry). The raw intensity values of each data point in the bin are then corrected by subtracting the bin's baseline. The subtraction depends on the type of chromatogram used to determine the baseline.

    If the base peak chromatogram was used then the corrected intensity values are calculated as follows:
    Icorr = max(0, Iorig - Ibase)

    If the TIC chromatogram was used then the corrected intensity values are calculated as follows:
    Icorr = max(0, Iorig * (1 - Ibase / Imax))

    where Iorig, Ibase, Imax and Icorr are the original, baseline, maximum and corrected intensity values, respectively, for a given scan and m/z bin. If Ibase is less than or equal to zero then no correction is performed, i.e. Icorr = Iorig.

  4. A new raw data file is generated from the corrected intensity values.
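
The binning, chromatogram construction and correction formulas above can be sketched in a few lines of Python. This is only an illustration, not the module's own code: the function names (bin_index, build_chromatograms, correct_intensity) are hypothetical, each scan is assumed to be a list of (m/z, intensity) pairs, and Imax is simply passed in as an argument because the text defines it only as the maximum intensity for the scan and bin.

```python
import math


def bin_index(mz, mz_min, bin_width):
    """Step 1: map an m/z value to the index of the bin it falls in."""
    return int(math.floor((mz - mz_min) / bin_width))


def build_chromatograms(scans, bin_width, use_tic=True):
    """Step 2: build one chromatogram (one value per scan) for each m/z bin.

    Assumes the scans contain at least one data point in total.
    """
    mz_min = min(mz for scan in scans for mz, _ in scan)
    mz_max = max(mz for scan in scans for mz, _ in scan)
    n_bins = bin_index(mz_max, mz_min, bin_width) + 1
    chroms = [[0.0] * len(scans) for _ in range(n_bins)]
    for s, scan in enumerate(scans):
        for mz, intensity in scan:
            b = bin_index(mz, mz_min, bin_width)
            if use_tic:
                chroms[b][s] += intensity  # TIC: summed intensity
            else:
                chroms[b][s] = max(chroms[b][s], intensity)  # base peak
    return mz_min, chroms


def correct_intensity(i_orig, i_base, i_max, use_tic):
    """Step 3: apply the correction formula for one data point.

    i_max is only used for the TIC formula and is assumed to be positive.
    """
    if i_base <= 0:
        return i_orig  # no correction for a non-positive baseline
    if use_tic:
        return max(0.0, i_orig * (1.0 - i_base / i_max))
    return max(0.0, i_orig - i_base)
```

For example, with a baseline of 4 and a maximum of 20, an original intensity of 10 becomes 6 under the base peak formula and 8 under the TIC formula.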

Method Parameters

Filename suffix
The text to append to the name of the baseline corrected raw data file.
Chromatogram type
TIC: total ion count, i.e. summed intensities per scan, or
Base peak intensity: maximum intensity per scan.
MS-level
MS level to which to apply correction. Select "0" for all levels.
Smoothing
The smoothing factor. Typically in the range 10^5 to 10^8. Larger values produce a smoother baseline.
Asymmetry
The weight (p) for points above the trendline. Conversely, 1-p is the weight applied to points below the trendline. For baselines use a small value of p, so that peaks rising above the trendline have little influence on it.
Use m/z bins
Baselines can be calculated, and data points corrected, either per m/z bin or for the entire raw data file. If no binning is performed then a single chromatogram is calculated for the entire raw data file and its baseline is used to correct the full data file. Correcting without binning is very quick but much less accurate, so it is only suitable for fine-tuning the smoothing and asymmetry parameters.
m/z bin width
The width of the m/z bins if binning is performed (see Use m/z bins). Smaller bin widths result in longer processing times and greater memory requirements. Avoid values below 0.01.
Remove source file
Whether to remove the original raw data file once baseline correction is complete.
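
To show how the Smoothing and Asymmetry parameters interact, here is a minimal pure-Python sketch of asymmetric least-squares smoothing. It is not the ptw implementation the module calls. It assumes the usual formulation with a second-order difference penalty, in which points above the current fit receive weight p and all other points receive weight 1-p. The function names and default values are hypothetical, and the dense solver is only practical for short chromatograms.

```python
def second_difference_penalty(n):
    """Return D^T D as a dense n x n matrix, D being the second-difference operator."""
    dtd = [[0.0] * n for _ in range(n)]
    for r in range(n - 2):
        row = ((r, 1.0), (r + 1, -2.0), (r + 2, 1.0))
        for i, a in row:
            for j, b in row:
                dtd[i][j] += a * b
    return dtd


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            if f != 0.0:
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def als_baseline(y, smoothing=1e5, asymmetry=0.01, max_iter=20):
    """Estimate a baseline for the chromatogram y (a list of intensities)."""
    n = len(y)
    dtd = second_difference_penalty(n)
    w = [1.0] * n  # start from an ordinary (symmetric) smoother
    z = list(y)
    for _ in range(max_iter):
        # Solve (W + smoothing * D^T D) z = W y for the current weights.
        a = [[smoothing * dtd[i][j] + (w[i] if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        z = solve(a, [w[i] * y[i] for i in range(n)])
        # Points above the fit get weight p, the rest get 1 - p.
        new_w = [asymmetry if y[i] > z[i] else 1.0 - asymmetry
                 for i in range(n)]
        if new_w == w:
            break  # weights stable, so the fit has converged
        w = new_w
    return z
```

With a small asymmetry value the fit stays close to the points lying underneath the peaks, while a larger smoothing value stiffens the curve so that it cannot bend up into them.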

Requirements

This module requires the R statistical computing software to be installed, together with two R packages:

  1. ptw (parametric time warping): provides the asymmetric least-squares implementation. To install ptw, run R and enter:
    install.packages("ptw")
  2. rJava: provides an interface between MZmine and R. To install rJava, run R and enter:
    install.packages("rJava")

You will also need to correctly configure several environment variables in your MZmine start-up script (startMZmine_Windows.bat, startMZmine_Linux.sh or startMZmine_MacOSX.command):

R_HOME
This is the directory where R is installed, e.g. for Windows it will be something like C:\Program Files\R\R-2.15.0.
R_LIBS_USER
This is the directory in which R installs third-party packages. It's usually a sub-directory of your personal directory, e.g. for Windows it will be something like %USERPROFILE%\Documents\R\win-library\2.15.
PATH
Append the directory that contains R's libraries. It will be a sub-directory of %R_HOME%, e.g. for 32-bit Windows it will be something like %R_HOME%\bin\i386 or for 64-bit Windows %R_HOME%\bin\x64.
JRI_LIB_PATH
This is the directory where rJava has installed its JRI libraries. It will be a sub-directory of %R_LIBS_USER%, e.g. for 32-bit Windows it will be something like %R_LIBS_USER%\rJava\jri\i386 or for 64-bit Windows %R_LIBS_USER%\rJava\jri\x64.
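
Before starting MZmine it can help to confirm that these variables are in place. The following Python sketch is one hypothetical way to do so and is not part of MZmine. The function name missing_r_settings is made up for illustration. The check that some PATH entry begins with R_HOME is a simplification of the rule described above.

```python
import os

# Variables that should each name an existing directory.
REQUIRED = ("R_HOME", "R_LIBS_USER", "JRI_LIB_PATH")


def missing_r_settings(env, is_dir=os.path.isdir):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    for name in REQUIRED:
        value = env.get(name)
        if not value:
            problems.append(name + " is not set")
        elif not is_dir(value):
            problems.append(
                name + " does not point to an existing directory: " + value)
    # PATH should contain a sub-directory of R_HOME (R's own libraries).
    r_home = env.get("R_HOME", "")
    path_entries = env.get("PATH", "").split(os.pathsep)
    if r_home and not any(e.startswith(r_home) for e in path_entries):
        problems.append("PATH has no entry under R_HOME")
    return problems
```

Calling missing_r_settings(os.environ) from the same shell that launches MZmine returns an empty list when nothing obvious is wrong.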

References

[1] Boelens, H.F.M., Eilers, P.H.C., Hankemeier, T. (2005) "Sign constraints improve the detection of differences between complex spectral data sets: LC-IR as an example", Analytical Chemistry, 77, 7998 – 8007.