nuisance is hosted by Hepforge, IPPP Durham
close Warning: Can't synchronize with repository "(default)" ("(default)" is not readable or not a Git repository.). Look in the Trac log for more information.

Version 40 (modified by Clarence Wret, 8 years ago) (diff)

--

How to add a sample in NUISANCE

This is an informal step-by-step guide on how to add samples into the NUISANCE framework. Most of this can be derived by looking at the already implemented measurements, and I'd advise potential readers to have a look around in the src directory in addition to reading this guide.

For this tutorial I'll be adding the T2K CC1pi+ H2O data. The data comes from the T2K site and arxiv. The data is supplied both in a ROOT file and a number of .csv files: I'll be using the ROOT file here.

After this we'll be able to make data/MC comparisons to the T2K CC1pi+ H2O data-set for NEUT, GENIE, GiBUU and NuWro; all in the same framework. We'll get MC predictions, chi2 values, interaction mode contributions, and more using a consistent signal definition, guaranteed the same across the generators.

NUISANCE is currently fully written in C++03 and we would appreciate future contributions to adhere to this standard.

Versions: NUISANCE v1r0, NEUT 5.3.6, arxiv 1605.07964v2.pdf

Examining the data and choosing distributions

Finding the neutrino flux and generating events

The first issue at hand is to find the flux for the experiment. If we don't have this, we cant make a generator prediction---unless the measurement is a total cross-section without any phase space cuts (in which case you should probably cast a suspicious eye).

A quick search through the arxiv document points us to reference [12]. It is also in our flux list.

We then generate events in NEUT 5.3.6 with a suitable card-file, see our NEUT guide on how to do this. We need the correct target (water) and flux, and perform the model selections we want (e.g. Rein-Sehgal or Berger-Sehgal coherent model). The procedure is very similar for other generators too.

Aim to generate around 1M events with all interaction modes turned on. This way we make sure to get all interaction modes into the topologically defined cross-section. We get a small amount of CCQE events which excite a pion from the outgoing nucleon interacting in the nucleus to kick out a pion, leading to a CCQE+1pi ~ CC1pi+ final state, which is signal for this particular sample.

Choosing a cross-section distribution

The T2K CC1pi+ H2O data release contains various distributions in FIG 4. In this tutorial I'll look at adding one kinematic distribution and one derived distribution: pmu and Erecnu shown below.

In NUISANCE we try to add all available distributions from a publication. However, some distributions will have detector effects "folded" into them (i.e. they will be raw detector-level data). We can only use these if there is some method which bring detector-level variables (e.g. p_mu seen in the detector) to truth-level variable (e.g. p_mu seen after correcting for the detector effects).

Preparing structure for a sample

NUISANCE is designed to easily allow adding new samples.

Each experimental measurement is its own class. To simplify including new samples we supply a base class for 1D (Measurement1D) and 2D (Measurement2D) measurements, which in turn inherits from the virtual base class (MeasurementBase).

These MeasurementBase classes are then looped over in the executables and data/MC comparisons are the result. A class has to at least inherit from MeasurementBase and implement the necessary methods to be "looped over". The recommended method is to use and expand the supplied Measurement1D and Measurement2D classes; anything else may require expert knowledge and won't be covered here.

The inheritance tree is simple and goes Specific_Measurement -> MeasurementXD -> MeasurementBase

Naming the sample

Some automatic processing is done on loading up the samples to set up generator scaling factors, chi2 calculations and so on. These are simple string comparisons done in the base class constructors, but do place responsibility on the user.

The structure is Experiment_Measurement_Target_DataType_DimensionVar_Neutrino:

  • Experiment is the experiment (for us T2K)
  • Measurement is a suitable name for the cross-section (for us CC1pip)
  • Target is the interaction target (for us H2O)
  • DataType is the measurement type, either Evt for an event rate measurement or XSec for a cross-section measurement (for us XSec)
  • DimensionVar is 1D or 2D followed by a suitable name for the dependent variable (for us 1Dpmu)
  • Neutrino is the neutrino type (for us nu)

Out of these, string comparisons are only made on DataType and DimensionVar. The other identifiers exist to adhere to some standard and keep sample naming tidy and consistent.

What to do:

Following the above convention, we end up with T2K_CC1pip_H2O_XSec_1Dpmu_nu: there is no ambiguity what the class describes and there is no way of confusing it with other classes in NUISANCE.

Placing the sample in a directory

As with many packages, NUISANCE has source code in the src directory. We have a directory for each experiment inside src, e.g. src/T2K, src/MINERvA, and so on.

The applications using the src files go in the app directory. However, including a new sample does not involve changing anything there, so let's ignore it for the purpose of this tutorial.

The new files themselves for the new samples should be identical (or very similar) to the sample name to avoid confusion.

What to do:

Make the new files src/T2K/T2K_CC1pip_H2O_XSec_1Dpmu_nu.cxx and src/T2K/T2K_CC1pip_H2O_1Dpmu_nu.h. These are the implementation and header files for the new measurement.

Placing the data in a directory

The data for the measurement goes into the data directory.

As with the src directory we have sub-directories for each experiment, e.g. data/T2K and data/MINERvA.

Furthermore, we specify another directory for the measurement topology to avoid confusion, e.g. data/T2K/CC1pip for our sample.

In some cases we might have the same topology defining the cross-section but for different targets. In this case we add another sub-directory for the target, e.g. data/T2K/CC1pip/H2O.

What to do:

Put the ROOT file from the data release into the data/T2K/CC1pip/H2O directory.


Coding up a sample

Now that we have the rough structure set-up, we can finally start writing some code.

Writing the header

In the case of our T2K CC1pi+ H2O data, we're dealing with 1D distributions. They should therefore inherit from the Measurement1D base class, as mentioned earlier. The Measurement1D class is implemented in src/FitBase/Measurement1D.cxx.

NUISANCE requires a constructor and destructor for the class, and we'll need to overload MeasurementBase methods which define the dependent variable(s) (pmu in our case) and what our signal is (CC interaction with one muon and one positive pion with no other pions or mesons and any number of nucleons in our case). The MeasurementBase functions which we need to overload are MeasurementBase::isSignal(FitEvent *event) and MeasurementBase::FillEventVariables(FitEvent *event).

The FitEvent class is an object which contains information about one single event: all the particles, their kinematics and their status after the interaction, the interaction channel which produced the final state, possible secondary interactions, the interaction target, and so on. The FitEvent class is implemented in src/FitBase/FitEvent.cxx and essentially the generator-independent common event format which NUISANCE uses for the generator events.

We see that the src/T2K directory contains T2K_SignalDef.cxx. This extends the SignalDef namespace which holds many of the signal definitions for the classes. If you're implementing multiple distributions for the same measurement (e.g. dSig/dpmu and dSig/dcosmu) we recommend adding the signal definition to the namespace rather than directly in the isSignal(FitEvent *event) function in the class implementation. We'll talk about implementing a signal definition later.

What to do: Our header file now looks like:

#ifndef T2K_CC1PIP_H2O_XSEC_1DPMU_NU_H_SEEN 
#define T2K_CC1PIP_H2O_XSEC_1DPMU_NU_H_SEEN

#include "Measurement1D.h"
#include "T2K_SignalDef.h"

class T2K_CC1pip_H2O_XSec_1Dpmu_nu : public Measurement1D {
public:
  T2K_CC1pip_H2O_XSec_1Dpmu_nu(std::string inputfile, FitWeight *rw, std::string  type, std::string fakeDataFile);
  virtual ~T2K_CC1pip_H2O_XSec_1Dpmu_nu() {};

  void FillEventVariables(FitEvent *event);
  bool isSignal(FitEvent *event);

  private:
};

#endif

where we've copied the constructor arguments from some other class we're using as a template. I'll elaborate more on this below.

Making the constructor

NUISANCE loads samples through FCN/SampleList.cxx in the function SampleUtils::LoadSample, which makes a std::list of MeasurementBase pointers which it later loops over from the executables.

Most of the automated setting up of histograms, titles, and so on is done in Measurement1D::SetupMeasurement(std::string inputfile, std::string type, FitWeight *rw, std::string fakeDataFile). It hence makes sense to have a class constructor which passes all of these on to SetupMeasurement.

The arguments to Measurement1D::SetupMeasurement are:

  • inputfile: the location of the input file containing generated events
  • type: the type of comparison we want to make with this sample
  • rw: the reweighting engine which may or may not be used to reweight the sample
  • fakeDataFile: the location of a fake-data input file which is to be used as data instead of the actual data for fake-data studies

which currently all have to be passed. In reality, both rw and fakeDataFile do nothing in SetupMeasurement but are left for legacy reasons. We will likely change this soon!

Looking at the base classes which we are inheriting our sample from (Measurement1D and MeasurementBase, both in src/FitBase), we minimally need to set:

  • std::string fName: The name of the sample
  • std::string fPlotTitles: The x and y axes of the sample
  • double EnuMin: The minimum neutrino energy in the sample signal definition
  • double EnuMax: The maximum neutrino energy in the sample signal definition
  • TH1D *fDataHist: The data histogram scanned from the provided input
  • TH1D *fMCHist: The generator histogram binning with same units as fDataHist
  • TH1D *fMCStat: The generator histogram without any scaling (i.e. Nevents)
  • TH1D *fMCFine: The generator histogram with same units as fDataHist but finer binning
  • TH1 **fMCHist_PDF: The mode-by-mode generator histograms
  • TMatrixDSym *fFullCovar: The covariance matrix
  • TMatrixDSym *covar: The inverted covariance matrix (bad naming, sorry!)
  • double fScaleFactor: The scaling factor we need to go from a event rate to a cross-section (Nevt(pmu) -> dSig(pmu))

We can set most of these by calling helper functions in Measurement1D, such as

  • Measurement1D::SetupMeasurement: checks the naming of the sample and sets up the protected/private members, sets up the input generator processing for the sample, sets the fit options
  • Measurement1D::SetDataValues: sets the data histogram from a given file
  • Measurement1D::SetCovarMatrix: sets up the covariance matrices needed to calculate likelihoods
  • Measurement1D::SetupDefaultHist: sets all the generator histograms and their auxillary histograms (e.g. mode-by-mode generator histograms)

or similar functions, e.g. calling Measurement1D::SetDataFromFile instead of Measurement1D::SetDataValues. Feel free to add reasonable setter functions to the base classes if you can't find a suitable one'''

Current NUISANCE (v1r0) behaviour requires fName, fPlotTitls, EnuMin, EnuMax and fScaleFactor to be set by user in the implementation file. These are the only variables that need to be supplied by the user in the constructor, granted above helper functions are used.

For non-expert use, it's sufficient to just copy the fScaleFactor variable as (fEventHist->Integral("width")*1E-38)/double(fNEvents)/TotalIntegratedFlux("width");. This number is applied to the event rate histogram to take us to a cross-section histogram in Measurement1D::ScaleEvents.

If the cross-section is not in cm2/nucleon but instead cm2/CH (or any other molecule) you will have to scale the fScaleFactor by the number of available target nucleons.

For CCQE, numu interactions only occur off neutrons, which we have 6 of in CH. But the total number of nucleons in CH is 13. So if an experiment reports their cross-section in cm2/CH for CCQE numu and our generator output is cm2/nucleon we need to do fScaleFactor*=13./6..

If instead an experiment measures a CC1pi+ cross-section in cm^2^/CH, we scale by a factor 13 because CC1pi+ happens on both neutrons and protons.

It's possible that these will be automated in the future, so please look at recently implemented samples for up-to-date references.

#include "T2K_CC1pip_H2O_XSec_1Dpmu_nu.h"

// The muon momentum

//******************************************************************** 
T2K_CC1pip_H2O_XSec_1Dpmu_nu::T2K_CC1pip_H2O_XSec_1Dpmu_nu(std::string inputfile, FitWeight *rw, std::string  type, std::string fakeDataFile){
//******************************************************************** 

  fName = "T2K_CC1pip_H2O_XSec_1Dpmu_nu";
  fPlotTitles = "; p_{#mu} (GeV/c); d#sigma/dp_{#mu} (cm^{2}/(GeV/c)/nucleon)";
  EnuMin = 0.; 
  EnuMax = 10.;
  Measurement1D::SetupMeasurement(inputfile, type, rw, fakeDataFile);

  // Data comes in ROOT file
  // hResultTot is cross-section with all errors
  // hResultStat is cross-section with stats-only errors
  // hTruthNEUT is the NEUT cross-section given by experimenter
  // hTruthGENIE is the GENIE cross-section given by experimenter
  SetDataFromFile(GeneralUtils::GetTopLevelDir()+"/data/T2K/CC1pip/H2O/nd280data-numu-cc1pi-xs-on-h2o-2015.root","MuMom/hResultTot");
  SetCovarFromDataFile(GeneralUtils::GetTopLevelDir()+"/data/T2K/CC1pip/H2O/nd280data-numu-cc1pi-xs-on-h2o-2015.root", "MuMom/TotalCovariance", true);

  SetupDefaultHist();

  fScaleFactor = (fEventHist->Integral("width")*1E-38)/double(fNEvents)/TotalIntegratedFlux("width");
};

Specifying the event-level dependent variable(s)

Specifying a signal definition

The final implementation

T2K_CC1pip_H2O_XSec_1Dpmu_nu.cxx:

#include "T2K_CC1pip_H2O_XSec_1Dpmu_nu.h"

// The muon momentum

//******************************************************************** 
T2K_CC1pip_H2O_XSec_1Dpmu_nu::T2K_CC1pip_H2O_XSec_1Dpmu_nu(std::string inputfile, FitWeight *rw, std::string  type, std::string fakeDataFile){
//******************************************************************** 

  fName = "T2K_CC1pip_H2O_XSec_1Dpmu_nu";
  fPlotTitles = "; p_{#mu} (GeV/c); d#sigma/dp_{#mu} (cm^{2}/(GeV/c)/nucleon)";
  EnuMin = 0.;
  EnuMax = 10.;
  Measurement1D::SetupMeasurement(inputfile, type, rw, fakeDataFile);

  // Data comes in ROOT file
  // hResultTot is cross-section with all errors
  // hResultStat is cross-section with stats-only errors
  // hTruthNEUT is the NEUT cross-section given by experimenter
  // hTruthGENIE is the GENIE cross-section given by experimenter
  SetDataFromFile(GeneralUtils::GetTopLevelDir()+"/data/T2K/CC1pip/H2O/nd280data-numu-cc1pi-xs-on-h2o-2015.root","MuMom/hResultTot");
  SetCovarFromDataFile(GeneralUtils::GetTopLevelDir()+"/data/T2K/CC1pip/H2O/nd280data-numu-cc1pi-xs-on-h2o-2015.root", "MuMom/TotalCovariance", true);

  SetupDefaultHist();

  fScaleFactor = (fEventHist->Integral("width")*1E-38)/double(fNEvents)/TotalIntegratedFlux("width");
};

//******************************************************************** 
// Find the momentum of the muon
void T2K_CC1pip_H2O_XSec_1Dpmu_nu::FillEventVariables(FitEvent *event) {
//******************************************************************** 

  // Need to make sure there's a muon
  if (event->NumFSParticle(13) == 0) return;

  // Get the muon
  TLorentzVector Pmu  = event->GetHMFSParticle(13)->fP;

  double p_mu = FitUtils::p(Pmu);

  fXVar = p_mu;

  return;
};

//******************************************************************** 
// Beware: The H2O analysis has different signal definition to the CH analysis!
bool T2K_CC1pip_H2O_XSec_1Dpmu_nu::isSignal(FitEvent *event) {
//******************************************************************** 
  return SignalDef::isCC1pip_T2K_H2O(event, EnuMin, EnuMax);
}

T2K_CC1pip_H2O_XSec_1Dpmu_nu.h:

#ifndef T2K_CC1PIP_H2O_XSEC_1DPMU_NU_H_SEEN 
#define T2K_CC1PIP_H2O_XSEC_1DPMU_NU_H_SEEN

#include "Measurement1D.h"
#include "T2K_SignalDef.h"

class T2K_CC1pip_H2O_XSec_1Dpmu_nu : public Measurement1D {
public:
  T2K_CC1pip_H2O_XSec_1Dpmu_nu(std::string inputfile, FitWeight *rw, std::string  type, std::string fakeDataFile);
  virtual ~T2K_CC1pip_H2O_XSec_1Dpmu_nu() {};

  void FillEventVariables(FitEvent *event);
  bool isSignal(FitEvent *event);

  private:
};

#endif

Make NUISANCE aware of the sample

FCN/SampleList

Add the sample to the makefile

CMake

Attachments (2)

Download all attachments as: .zip