nuisance is hosted by Hepforge, IPPP Durham
close Warning: Can't synchronize with repository "(default)" ("(default)" is not readable or not a Git repository.). Look in the Trac log for more information.

Version 5 (modified by Patrick Stowell, 8 years ago) (diff)

--

Page BUILD Built

GENIE NUISMIN

PAGE BEING BUILT

The NUISANCE minimizer application can be used to tune MC models by directly comparing with published scattering data and using ROOT's minimizer libraries to find a best fit parameter set.

Wiki content

  1. Page BUILD Built
  2. GENIE NUISMIN
  3. PAGE BEING BUILT
  4. Running NUISMIN
  5. Running a GENIE Minimization
    1. Generating GENIE Events
      1. Running gevgen
      2. Running PrepareGENIE
    2. Choosing our samples
      1. Writing a card file
      2. Reweighting GENIE Comparisons
      3. Running the minimiser (Migrad)
      4. Analysing the output
    3. Alternative Fitting Routines
      1. Running GSL Minimiser Routines
      2. Fixing dials at limits
      3. Running a 1D likelihood scan
      4. Running a 2D likelihood scan
      5. Running a Contour Scan
      6. Generating Post-fit Error Bands
    4. Running a fake data fit

Running NUISMIN

Author: Patrick Stowell

Date: June 2017

Versions: NUISANCE v2r0, GENIE 2.12.6

The following example details how to run NUISANCE and tune a simple model to MiniBooNE_CCQE_XSec_1DQ2_nu data, with additional examples on how to include penalty terms and perform fake data fits. This example assumes you have at least tried to run the nuiscomp tutorial shown here HowToUseNUISCOMP and understand what the standard NUISANCE sample outputs look like. If you haven't tried the nuiscomp tutorial you should start there first as its relatively quick and an easier place to start.

Each generator requires very slightly different ways to handle NUISANCE, therefore multiple versions of this tutorial have been provided. Please use the following links to choose what generator you would like to use.

Running a GENIE Minimization

In this example we will take pregenerated GENIE events as our starting model and run a tuning to find a best fit value for some of the CCQE Reweight dials in GENIE ReWeight?.

Generating GENIE Events

If we want to see how a given GENIE model behaves, first we need to generate events. This can be done using the standard gevgen application, using the appropriate target and flux for a given data sample.

If all you want to do is check your NUSIANCE is built correctly, you can skip this step by downloading MC files from our online storage area by following the steps found here:LinkToNUISANCEMCFiles

Running gevgen

The standard gevgen application options can be ran using

 $ gevgen -h
Syntax:

      gevgen [-h]
              [-r run#]
               -n nev
               -e energy (or energy range)
               -p neutrino_pdg
               -t target_pdg
              [-f flux_description]
              [-o outfile_name]
              [-w]
              [--seed random_number_seed]
              [--cross-sections xml_file]
              [--event-generator-list list_name]
              [--message-thresholds xml_file]
              [--unphysical-event-mask mask]
              [--event-record-print-level level]
              [--mc-job-status-refresh-rate  rate]
              [--cache-file root_file]

We need to provide a flux and target list to GENIE when running gevgen.

We want to run comparisons to MiniBooNE muon-neutrino scattering data on a mineral oil target, therefore to generate events we pass it the default GENIE splines, the MiniBooNE flux in root format, and the required target and beam peg settings

$ source $GENIE_DIR/environment_setup.sh
$ export GXMLPATH=${GENIE_DIR}/genie_xsec/v2_12_0/NULL/DefaultPlusMECWithNC/data/
$ gevgen -n 2500000 \
                -t 1000060120[0.85714],1000010010[0.14286] \ 
                -p 14 --cross-sections $GXMLPATH/gxspl-FNALsmall.xml \
                --event-generator-list Default -f MiniBooNE_numu_flux.root,numu_mb \
                -e 0,10 -o gntp.2063030.ghep.root

For more information on how to generate events in GENIE please see: https://arxiv.org/abs/1510.05494

Once our events have been generated we can check that they have finished correctly by opening the output event file and checking it has a ‘gtree’ object and the checking flux spectrum file contains the correct histogram.

$ root gntp.2063030.ghep.root
root [0]
Attaching file gntp.2063030.ghep.root as _file0...
Warning in <TClass::TClass>: no dictionary for class genie::NtpMCEventRecord is available
Warning in <TClass::TClass>: no dictionary for class genie::NtpMCRecordI is available
Warning in <TClass::TClass>: no dictionary for class genie::NtpMCRecHeader is available
Warning in <TClass::TClass>: no dictionary for class genie::EventRecord is available
Warning in <TClass::TClass>: no dictionary for class genie::GHepRecord is available
Warning in <TClass::TClass>: no dictionary for class genie::Interaction is available
Warning in <TClass::TClass>: no dictionary for class genie::InitialState is available
Warning in <TClass::TClass>: no dictionary for class genie::Target is available
Warning in <TClass::TClass>: no dictionary for class genie::ProcessInfo is available
Warning in <TClass::TClass>: no dictionary for class genie::Kinematics is available
Warning in <TClass::TClass>: no dictionary for class genie::XclsTag is available
Warning in <TClass::TClass>: no dictionary for class genie::KPhaseSpace is available
Warning in <TClass::TClass>: no dictionary for class genie::GHepParticle is available
Warning in <TClass::TClass>: no dictionary for class pair<genie::EKineVar,double> is available
root [1] _file0->ls();
TFile**		gntp.2063030.ghep.root
 TFile*		gntp.2063030.ghep.root
  KEY: genie::NtpMCTreeHeader	header;1	GENIE output tree header
  KEY: TFolder	gconfig;1	GENIE configs
  KEY: TFolder	genv;1	GENIE user environment
  KEY: TTree	gtree;1	GENIE MC Truth TTree, Format: [NtpMCEventRecord]
$ root input-flux.root
root [0]
Attaching file input-flux.root as _file0...
root [1] _file0->ls();
TFile**		input-flux.root
 TFile*		input-flux.root
  KEY: TH1D	spectrum;1	neutrino_flux

Now that the event samples have been generated correctly, we need to prepare them for use in NUISANCE.

Running PrepareGENIE

The standard gevgen application doesn’t save the total event rate predictions into the event file itself. NUISANCE needs these to correctly normalise predictions so before we can use these new events we need to prepare them.

The PrepareGENIE application is built when NUISANCE is built with GENIE support should be available after the NUISANCE environmental setup script is ran.

$ PrepareGENIE -h
PrepareGENIEEvents NUISANCE app.
Takes GHep Outputs and prepares events for NUISANCE.

PrepareGENIEEvents  [-h,-help,--h,--help] 
                                    [-i inputfile1.root,inputfile2.root,inputfile3.root,...] 
                                    [-f flux_root_file.root,flux_hist_name] 
                                    [-t target1[frac1],target2[frac2],...]

Prepare Mode [Default] : Takes a single GHep file, reconstructs the original GENIE splines,  and creates a duplicate file that 
also contains the flux, event rate, and xsec predictions that NUISANCE needs.
Following options are required for Prepare Mode:
 [ -i inputfile.root  ] : Reads in a single GHep input file that needs the xsec calculation ran on it.
 [ -f flux_file.root,hist_name ] : Path to root file containing the flux histogram the GHep records were generated with. A 
simple method is to point this to the flux histogram genie generatrs '-f /path/to/events/input-flux.root,spectrum'.
 [ -t target ] : Target that GHepRecords were generated with. Comma seperated list. E.g. for CH2 
target=1000060120,1000010010,1000010010

The PrepareGENIE application, when ran, loops over all the events, extracts the cross-section as a function of energy for each discrete interaction mode and uses this information to reconstruct the cross-section splines for each target that were used to generate events.

These splines are then multiplied by specified flux and added according to the target definition provided to produce total flux and event rate predictions as a function of energy for the sample and saves them into the events file.

We want to prepare our MiniBooNE events so we pass in the event files, the input flux, and the CH2 target definition.

 $ PrepareGENIE -i gntp.2063030.ghep.root -f input-flux.root,spectrum -t  1000060120,1000010010,1000010010  

Now when we open our event file again, we should see the flux and event rate histograms are now saved into the file ready for NUISANCE to read them.

  KEY: genie::NtpMCTreeHeader	header;1	GENIE output tree header
  KEY: TFolder	gconfig;1	GENIE configs
  KEY: TFolder	genv;1	GENIE user environment
  KEY: TTree	gtree;1	GENIE MC Truth TTree, Format: [NtpMCEventRecord]
  KEY: TDirectoryFile	IndividualGENIESplines;1	IndividualGENIESplines
  KEY: TDirectoryFile	TargetGENIESplines;1	TargetGENIESplines
  KEY: TH1F	nuisance_xsec;1
  KEY: TH1F	nuisance_events;1
  KEY: TH1F	nuisance_flux;1

Choosing our samples

Now that we have an event sample we can load them load them into NUISANCE so that it can use them to form a joint likelihood by specifying them at run time.

Writing a card file

To specify samples we need to write a NUISANCE card file that lists all comparisons that should be made and the event files that should be used for each one.

We want to produce comparisons to MiniBooNE CCQE data, so first we should search the NUISANCE sample list.

The ‘nuissamples’ script is provided for easy access of the sample list. Running it without any arguments will return a full sample list of available data comparisons. Providing an additional argument will return only samples containing the provided substring.

We can list the MIniBooNE samples using

[stowell@hepgw1 ~]$ nuissamples MiniBooNE
MiniBooNE_CCQE_XSec_1DQ2_nu
MiniBooNE_CCQELike_XSec_1DQ2_nu
MiniBooNE_CCQE_XSec_1DQ2_antinu
MiniBooNE_CCQELike_XSec_1DQ2_antinu
MiniBooNE_CCQE_CTarg_XSec_1DQ2_antinu
MiniBooNE_CCQE_XSec_2DTcos_nu
MiniBooNE_CCQELike_XSec_2DTcos_nu
MiniBooNE_CCQE_XSec_2DTcos_antinu
MiniBooNE_CCQELike_XSec_2DTcos_antinu
MiniBooNE_CC1pip_XSec_1DEnu_nu
MiniBooNE_CC1pip_XSec_1DQ2_nu
MiniBooNE_CC1pip_XSec_1DTpi_nu
MiniBooNE_CC1pip_XSec_1DTu_nu
MiniBooNE_CC1pip_XSec_2DQ2Enu_nu
MiniBooNE_CC1pip_XSec_2DTpiCospi_nu
MiniBooNE_CC1pip_XSec_2DTpiEnu_nu
MiniBooNE_CC1pip_XSec_2DTuCosmu_nu
MiniBooNE_CC1pip_XSec_2DTuEnu_nu
MiniBooNE_CC1pi0_XSec_1DEnu_nu
MiniBooNE_CC1pi0_XSec_1DQ2_nu
MiniBooNE_CC1pi0_XSec_1DTu_nu
MiniBooNE_CC1pi0_XSec_1Dcosmu_nu
MiniBooNE_CC1pi0_XSec_1Dcospi0_nu
MiniBooNE_CC1pi0_XSec_1Dppi0_nu
MiniBooNE_NC1pi0_XSec_1Dcospi0_antinu
MiniBooNE_NC1pi0_XSec_1Dcospi0_rhc
MiniBooNE_NC1pi0_XSec_1Dcospi0_nu
MiniBooNE_NC1pi0_XSec_1Dcospi0_fhc
MiniBooNE_NC1pi0_XSec_1Dppi0_antinu
MiniBooNE_NC1pi0_XSec_1Dppi0_rhc
MiniBooNE_NC1pi0_XSec_1Dppi0_nu
MiniBooNE_NC1pi0_XSec_1Dppi0_fhc
MiniBooNE_NCEL_XSec_Treco_nu

We only care about CC1pip data therefore the following samples are of interest

[stowell@hepgw1 ~]$ nuissamples MiniBooNE_CCQE_
MiniBooNE_CCQE_XSec_1DQ2_nu
MiniBooNE_CCQE_XSec_1DQ2_antinu
MiniBooNE_CCQE_CTarg_XSec_1DQ2_antinu
MiniBooNE_CCQE_XSec_2DTcos_nu
MiniBooNE_CCQE_XSec_2DTcos_antinu

In this example we will compare to the CCQE 1DQ2 distributions, but we could provide a large number of the samples seen in the lists if we wanted. When we specify multiple samples nuismin will automatically calculate the likelihood for each one and add them uncorrelated to form a joint likelihood.

We write our card file with these two datasets using the following sample object format:

sample NAME_OF_SAMPLE  INPUT_TYPE:FILE_INPUT    
  • NAME_OF_SAMPLE : Name of the sample we found using nuissamples
  • INPUT_TYPE : Type of the input file we are using (e.g. GENIE)
  • FILE_INPUT : Path to the input MC event we want to use for this sample.
  • OPTION : (Optional Argument) Option that can be used to change sample behaviour at runtime. By default this is left as DEFAULT.
  • NORM_VALUE : (Optional Argument) Start value of normalisation parameter used to change the MC normalisation for this sample. By default this is left at 1.0.

For further examples on how to include these structures in card files please see Card File Examples.

So if our GENIE file is called 'MiniBooNE_FHC_numu_2.5M.root', our card file will be :

genie_tutorial.card

sample MiniBooNE_CCQE_XSec_1DQ2_nu GENIE:MiniBooNE_FHC_numu_2.5M.root
sample MiniBooNE_CCQE_XSec_1DQ2_antinu GENIE:MiniBooNE_FHC_numu_2.5M.root

If you want to check this is valid you could run this card file with 'nuiscomp' as shown in the tutorial here HowToUseNUISCOMP-GENIE and check the nominal prediction looks okay.

Reweighting GENIE Comparisons

The minimiser application allows different GENIE reweight parameters to be provided in our card file and floated freely until the likelihood is maximised.

The format for fixed dial values is:

genie_parameter  NAME   VALUE   STATE

or for free dial variations

genie_parameter  NAME   VALUE  LOW  HIGH  STEP STATE
  • NAME : Specifies the name of the reweight dial. These can be found in '$GENIE/src/ReWeight/GSyst.cxx'
  • VALUE: Value of the reweight dial in units of 1-sigma variations (1-sigma defined by GENIE)
  • LOW : Lower limit that the dial is allowed to float to.
  • HIGH : Upper limit that the dial is allowed to float to.
  • STEP : Step size passed to Minuit.
  • STATE: State of this parameter dial, for most cases this should be left as 'FIX'.

For further examples on how to include these structures in card files please see Card File Examples.

In our example we want to tune the following parameters

  • Charged Current Quasielastic Axial Mass : Axial mass parameter used in the quasielastic form factor
  • Charged Current Quasielastic Normalisation : Total normalisation of CCQE events.

First we look for the possible name in GENIE reweight:

$GENIE/src/ReWeight/GSyst.h

class GSyst {
public:
 //......................................................................................
 static string AsString(GSyst_t syst)
 {
   switch(syst) {
     case ( kXSecTwkDial_MaNCEL           ) : return "MaNCEL";               break;
     case ( kXSecTwkDial_EtaNCEL          ) : return "EtaNCEL";              break;
     case ( kXSecTwkDial_NormCCQE         ) : return "NormCCQE";             break;
     case ( kXSecTwkDial_NormCCQEenu      ) : return "NormCCQEenu";          break;
     case ( kXSecTwkDial_MaCCQE           ) : return "MaCCQE";               break;
     case ( kXSecTwkDial_MaCCQEshape      ) : return "MaCCQEshape";          break;
     case ( kXSecTwkDial_ZNormCCQE        ) : return "ZNormCCQE";            break;
     ...

We can see the dials we are interested in this list, specified by the strings: 'MaCCQE' and 'NormCCQE' respectively.

Now we want to change include these dials in our card file. To start with we will try tuning a single parameter, keeping NormCCQE fixed at +0.1 sigma instead. Although we are keeping it fixed, we define the limits on NormCCQE anyway, in case we want to vary it later.

genie_parameter MaCCQE    0.0  -3.0  3.0  1.0  FREE
genie_parameter NormCCQE  0.1  -3.0  3.0  1.0  FIX

So now our card file looks like the following:

genie_tutorial.card

genie_parameter MaCCQE    0.0  -3.0  3.0  1.0  FREE
genie_parameter NormCCQE  0.1  -3.0  3.0  1.0  FIX

sample MiniBooNE_CCQE_XSec_1DQ2_nu GENIE:MiniBooNE_FHC_numu_2.5M.root
sample MiniBooNE_CCQE_XSec_1DQ2_antinu GENIE:MiniBooNE_FHC_numu_2.5M.root 

Running the minimiser (Migrad)

Now that we have a prepared card file from the previous section, we can now run the minimiser.

Lets start it running, we will use a smaller number of events for this example so we can get some idea of the behaviour. We do this by passing the '-n' option at the command line.

WARNING this is only so you can quickly see the results of minuit and check that the minimiser and your card file is working correctly. For serious fitting you should never use the '-n' flag.

 $ nuismin -c genie_tutorial.card -o genie_minuit_maccqe-free_normccqe-fixed.root -n 50000

Now that we have left that running in the background, lets quickly discuss how Migrad works and some options you have to make NUISANCE run a bit quicker.

How Migrad works

Migrad is the default minimisation routine that nuismin uses. It attempts to find a best fit parameter set that maximises a likelihood (minimises a chi-squared value). When we run nuismin, the NUISANCE routines classes tells Migrad that the parameters we have defined as FREE should be treated as registered as free variables and the samples we have listed should be used to form a likelihood.

Then when Migrad is run, it will step around the parameter space provided to it, calculate a likelihood given the parameters at the new step, and use a gradient descent algorithm to work its way to a best fit parameter set. Each of these 'steps' is referred to as an iteration, and usually a large number of iterations are required so that Migrad can find a reliable minimum.

To calculate the likelihood at each step NUISANCE has to recalculate all of the MC predictions for the given parameter set at the step, before using them to calculate a data-MC likelihood. Since we have to loop over the event sample at each iteration, this can take a while to actually converge, hence why for this example we have restricted the number of events.

Eventually when Migrad does think it has found a best fit point, it will step around the minimum checking that it is a good valid minimum, and assuming Gaussian statistics, calculate errors and correlations on each of the parameters.

One issue that a user should be aware of is that Migrad can sometimes fall into local minima, which is especially more likely if the likelihood surface is non-linear. This can result in bias fit results, and it is advisable that extra closure tests are performed by start the minimiser from different starting values (changing the NOMINAL value in the parameter card file definition) and checking that the fit results are stable.

Signal Reconfigures

The fact the Migrad has to iterate many times can be very time consuming. With large event samples it is not uncommon for tunings with multiple parameters to take days since to calculate a likelihood for a given iteration NUISANCE has to recalculate all the MC predictions for the updated parameter set before comparing the data to MC. This involves reading every single event in our samples at each iteration, recalculating event weights using our reweight engines, filling each of the histograms with the updated weight, and then finally calculating a likelihood for all samples included. The two most time consuming pieces of this iteration are the reading and event weight calculation stages, and as such a lot of event handling functionality has been added to the core classes of NUISANCE to minimise iteration time. This is referred to as the 'EventManager?'.

The EventManager?'s job is to keep a global list of all possible inputs and what samples they are assigned to, making sure that no event is read from disk twice or event weight is calculated twice during a single iteration. When the EventManager? is switched on 'config EventManager? 1' which it should be by default, during a single event loop, first the EventManager? will read an event and calculate a weight before distributing that event information to only the relevant samples saving a significant amount of time compared to performing an event loop for each sample individually.

To increase the speed of the EventManager? even further a 'SignalReconfigures?=1' option has been added which is not enabled by default, but that makes sure that only events that have been flagged as signal by the sample classes actually get read from disk and weights calculated. This mode is available but has not been extensively tested with all samples yet, so use it with caution.

Saving Nominal Prediction

A configuration option is alose provided that forces NUISANCE to run an additional reconfigure step when starting nuismin that will save the starting MC predictions into a seperate folder in the output file called 'nominal/', so that pre-fit vs post-fit comparisons can be made.

This can be turned on by using the configuration option

config savenominal 1

Analysing the output

You'll notice that some of the nuismin output file is very similar to the output of nuiscomp. The nuismin application builds on those outputs by adding additional information on the fit results.

Data/MC Comparisons

The data/MC comparisons plots (e.g. MiniBooNE_CCQE_XSec_1DQ2_nu_MC) should have the same structure as that of the nuiscomp app, with one slight difference. The MC plots should 'hopefully' look a lot more like the data now.

nuismin should save the MC prediction at the best fit point of the minimisation into the output file, so if we compare the result to the same configuration ran through nuiscomp (before tuning) we should see an improvement.

If there is not an improvement then it means the minimiser failed in some way! Comparing before and after is a very qualitative way to understand the fitter progress however. Instead nuismin also saves a bunch of handy information detailing the steps taken along the way and the best fit results, detailed in the next sections.

Iteration Tree

Parameter Tuning Plots

Alternative Fitting Routines

Running GSL Minimiser Routines

Fixing dials at limits

Running a 1D likelihood scan

Running a 2D likelihood scan

Running a Contour Scan

Generating Post-fit Error Bands

Running a fake data fit