Package fim

Class JNIFIM

java.lang.Object
fim.JNIFIM

public class JNIFIM extends Object
Class for Java interface to frequent item set mining in C
Since:
2014.09.26
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    processing mode: add all (closed) item sets to repository (for Carpenter algorithm)
    static final int
    target type: all frequent item sets
    static final int
    item may appear only in a rule body/antecedent
    static final int
    item may appear only in a rule body/antecedent
    static final int
    Apriori variant: basic algorithm
    static final int
    algorithm variant: automatic choice (always applicable)
    static final int
    aggregation mode: average of individual measure values
    static final int
    aggregation mode: average of individual measure values
    static final int
    item may appear only in a rule body/antecedent
    static final int
    item may appear anywhere in a rule
    static final int
    item may appear anywhere in a rule
    static final int
    Carpenter variant: transaction identifier lists
    static final int
    Carpenter variant: item occurrence counter table
    static final int
    Carpenter variant: transaction identifier lists
    static final int
    Carpenter variant: transaction identifier lists
    static final int
    evaluation measure: certainty factor (larger is better)
    static final int
    evaluation measure: certainty factor (larger is better)
    static final int
    evaluation measure: normalized chi^2 measure (larger is better)
    static final int
    evaluation measure: p-value from chi^2 measure (smaller is better)
    static final int
    target type: closed (frequent) item sets
    static final String
    pattern spectrum report format: three columns size (integer), support (integer) and (average) occurrence frequency (double)
    static final int
    evaluation measure: conditional probability ratio (larger is better)
    static final int
    evaluation measure: rule confidence (larger is better)
    static final int
    evaluation measure: absolute confidence difference to prior (larger is better)
    static final int
    evaluation measure: rule confidence (larger is better)
    static final int
    item may appear only in a rule head/consequent
    static final int
    item may appear only in a rule head/consequent
    static final int
    evaluation measure: conviction (larger is better)
    static final int
    evaluation measure: conditional probability ratio (larger is better)
    static final int
    evaluation measure: conviction (larger is better)
    static final int
    evaluation measure: difference of conviction to 1 (larger is better)
    static final int
    evaluation measure: difference of conviction quotient to 1 (larger is better)
    static final int
    Eclat variant: transaction id lists intersection (basic)
    static final int
    Eclat variant: transaction id lists as bit vectors
    static final int
    Eclat variant: transaction id difference sets (diffsets)
    static final int
    Eclat variant: occurrence deliver from transaction lists
    static final int
    Eclat variant: transaction id lists intersection (improved)
    static final int
    Eclat variant: occurrence deliver from transaction lists
    static final int
    Eclat variant: transaction id range lists intersection
    static final int
    Eclat variant: item occurrence table (simplified)
    static final int
    Eclat variant: item occurrence table (standard)
    static final int
    Eclat variant: transaction id lists intersection (improved)
    static final int
    evaluation measure: Fisher's exact test (chi^2 measure) (smaller is better)
    static final int
    evaluation measure: Fisher's exact test (information gain) (smaller is better)
    static final int
    evaluation measure: Fisher's exact test (table probability) (smaller is better)
    static final int
    evaluation measure: Fisher's exact test (support) (smaller is better)
    static final int
    FP-growth variant: complex tree nodes (children and sibling)
    static final int
    FP-growth variant: simple tree nodes (link and parent)
    static final int
    FP-growth variant: top-down processing on a single prefix tree
    static final int
    FP-growth variant: top-down processing of the prefix trees
    static final int
    target type: all frequent item sets
    static final int
    target type: generator (frequent) item sets
    static final int
    target type: generator (frequent) item sets
    static final int
    item may appear only in a rule head/consequent
    static final int
    processing mode: check extensions for closed/maximal item sets with a horizontal scheme (default: use a repository)
    static final int
    surrogate method: identity (keep original data)
    static final int
    item may not appear anywhere in a rule
    static final int
    evaluation measure: importance (larger is better)
    static final int
    evaluation measure: importance (larger is better)
    static final int
    evaluation measure: information difference to prior (larger is better)
    static final int
    evaluation measure: p-value from information difference (smaller is better)
    static final int
    item may appear anywhere in a rule
    static final int
    item may appear only in a rule body/antecedent
    static final int
    processing mode: invalidate evaluation below expected support
    static final int
    IsTa variant: patricia tree (compact prefix tree)
    static final int
    IsTa variant: standard prefix tree
    static final int
    JIM: Baroni--Buser S_B = (x+s)/(x+r)
    static final int
    JIM variant: basic algorithm
    static final int
    JIM: Czekanowski S_D = 2s/(r+s)
    static final int
    JIM: Dice S_D = 2s/(r+s)
    static final int
    JIM: Faith S_F = (s+z/2)/n
    static final int
    JIM: generic measure S = (c_0s +c_1z +c_2n +c_3x) / (c_4s +c_5z +c_6n +c_7x)
    static final int
    JIM: Gower--Legendre S_N = 2(s+z)/(n+s+z)
    static final int
    JIM: Hamming S_M = (s+z)/n
    static final int
    JIM: Jaccard/Tanimoto S_J = s/r
    static final int
    JIM: Kulcynski S_K = s/q
    static final int
    JIM: no cover similarity
    static final int
    JIM: Rogers--Tanimoto S_T = (s+z)/(n+q)
    static final int
    JIM: Russel-Rao S_R = s/n
    static final int
    JIM: Sokal--Michener S_M = (s+z)/n
    static final int
    JIM: Sokal--Sneath 1 S_S = s/(r+q)
    static final int
    JIM: Sokal--Sneath 2 S_N = 2(s+z)/(n+s+z)
    static final int
    JIM: Sokal--Sneath 3 S_O = (s+z)/q
    static final int
    JIM: Sorensen S_D = 2s/(r+s)
    static final int
    JIM: Jaccard/Tanimoto S_J = s/r
    static final int
    evaluation measure: binary logarithm of support quotient (larger is better)
    static final int
    evaluation measure: lift value (confidence divided by prior) (larger is better)
    static final int
    evaluation measure: difference of lift value to 1 (larger is better)
    static final int
    evaluation measure: difference of lift quotient to 1 (larger is better)
    static final int
    aggregation mode: maximum of individual measure values
    static final int
    target type: maximal (frequent) item sets
    static final int
    aggregation mode: maximum of individual measure values
    static final int
    aggregation mode: minimum of individual measure values
    static final int
    aggregation mode: minimum of individual measure values
    static final int
    item may not appear anywhere in a rule
    static final int
    processing mode: do not collate equal transactions (for Carpenter algorithm)
    static final int
    processing mode: do not use a 16-items machine
    static final int
    processing mode: do not use head union tail (hut) pruning (for maximal item sets)
    static final int
    evaluation measure/aggregation mode: none
    static final int
    processing mode: do not use perfect extension pruning
    static final int
    processing mode: do not prune the prefix/patricia tree (for IsTa algorithm)
    static final int
    processing mode: do not sort items w.r.t. conditional support
    static final int
    processing mode: do not organize transactions as a prefix tree (for Apriori algorithm)
    static final String
    pattern spectrum report format: objects of type PatSpecElem
    static final int
    processing mode: use original support definition for rules (body & head instead of only body)
    static final int
    item may appear only in a rule head/consequent
    static final int
    processing mode: a-posteriori pruning of infrequent item sets (for Apriori algorithm)
    static final int
    surrogate method: random transaction generation
    static final int
    RElim variant: basic recursive elimination algorithm
    static final int
    processing mode: filter maximal item sets with repository (for Carpenter algorithm)
    static final int
    target type: association rules
    static final int
    SaM variant: basic split and merge algorithm
    static final int
    SaM variant: split and merge with binary search
    static final int
    SaM variant: split and merge with double source buffering
    static final int
    SaM variant: split and merge with transaction prefix tree
    static final int
    target type: all frequent item sets
    static final int
    surrogate method: shuffle table-derived data (columns)
    static final int
    evaluation measure: item set size times cover similarity (larger is better) (only for JIM algorithm)
    static final int
    evaluation measure: rule support (larger is better)
    static final int
    evaluation measure: rule support (larger is better)
    static final int
    surrogate method: permutation by pair swaps
    static final String
    the version string
    static final int
    processing mode: check extensions for closed/maximal item sets with a vertical scheme (default: use a repository)
    static final int
    evaluation measure: normalized chi^2 measure (Yates corrected) (larger is better)
    static final int
    evaluation measure: p-value from chi^2 measure (Yates corrected) (smaller is better)
  • Constructor Summary

    Constructors
    Constructor
    Description
    JNIFIM()
    Constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    static void
    abort(int state)
    Set the abort state (abort computations or clear abort state).
    static Object[]
    accretion(int[][] tracts, int[] wgts, double supp, int zmin, int zmax, String report, int stat, double siglvl, int maxext, int mode, int[] border)
    Java interface to Accretion algorithm in C.
    static Object[]
    apriacc(int[][] tracts, int[] wgts, double supp, int zmin, int zmax, String report, int stat, double siglvl, int prune, int mode, int[] border)
    Java interface to accretion-style Apriori algorithm in C.
    static Object[]
    apriori(int[][] tracts, int[] wgts, int target, double supp, double conf, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)
    Java interface to Apriori algorithm in C.
    static Object[]
    arules(int[][] tracts, int[] wgts, double supp, double conf, int zmin, int zmax, String report, int eval, double thresh, int mode, int[][] appear)
    Java interface to association rule induction in C.
    static Object[]
    carpenter(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to Carpenter algorithm in C.
    static Object[]
    eclat(int[][] tracts, int[] wgts, int target, double supp, double conf, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)
    Java interface to Eclat algorithm in C.
    static Object[]
    estpsp(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int equiv, double alpha, int smpls, int seed)
    Estimate a pattern spectrum from data characteristics.
    static Object[]
    fim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int[] border)
    Java interface to frequent item set mining in C (very simplified interface).
    static Object[]
    fpgrowth(int[][] tracts, int[] wgts, int target, double supp, double conf, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)
    Java interface to FP-growth algorithm in C.
    static Object[]
    genpsp(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int cnt, int surr, int seed, int cpus, int[] ctrl)
    Pattern spectrum generation with surrogate data sets.
    static Object[]
    ista(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to IsTa algorithm in C.
    static Object[]
    jim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int covsim, double[] simps, double sim, int algo, int mode, int[] border)
    Java interface to JIM algorithm in C.
    static Object[]
    relim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to RElim algorithm in C.
    static Object[]
    sam(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to SaM algorithm in C.
    static Object[]
    xfim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, int agg, double thresh, int[] border)
    Java interface to frequent item set mining in C (less simplified interface).

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
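Every mining method in the summary above takes the transactions as int[][] tracts, optionally with per-transaction weights in int[] wgts. The following is a minimal sketch, not part of the JNIFIM API, of one way to map string items to integer codes before such a call. The class ItemEncoder and its methods are hypothetical helpers, and numbering the items contiguously from 0 is an assumption that this page does not confirm as a requirement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of fim.JNIFIM): maps item names to integer codes. */
final class ItemEncoder {
    private final Map<String, Integer> codes = new LinkedHashMap<>();
    private final List<String> names = new ArrayList<>();

    /** Returns the code of an item name, assigning the next free code on first use. */
    int encode(String item) {
        Integer code = codes.get(item);
        if (code == null) {
            code = names.size();
            codes.put(item, code);
            names.add(item);
        }
        return code;
    }

    /** Returns the item name that belongs to a code. */
    String decode(int code) {
        return names.get(code);
    }

    /** Converts transactions of item names into the int[][] layout of the tracts parameter. */
    int[][] encodeAll(List<List<String>> transactions) {
        int[][] tracts = new int[transactions.size()][];
        for (int i = 0; i < tracts.length; i++) {
            List<String> t = transactions.get(i);
            tracts[i] = new int[t.size()];
            for (int j = 0; j < t.size(); j++) {
                tracts[i][j] = encode(t.get(j));
            }
        }
        return tracts;
    }

    public static void main(String[] args) {
        ItemEncoder enc = new ItemEncoder();
        int[][] tracts = enc.encodeAll(List.of(
            List.of("bread", "milk"),
            List.of("bread", "butter", "milk")));
        // Prints the encoded transactions; they could then be passed as the tracts argument.
        System.out.println(Arrays.deepToString(tracts));
    }
}
```

Keeping the encoder around after the call allows the integer item sets in a result to be translated back to names with decode.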
  • Field Details

    • VERSION

      public static final String VERSION
      the version string
    • SETS

      public static final int SETS
      target type: all frequent item sets
    • ALL

      public static final int ALL
      target type: all frequent item sets
    • FREQUENT

      public static final int FREQUENT
      target type: all frequent item sets
    • CLOSED

      public static final int CLOSED
      target type: closed (frequent) item sets
    • MAXIMAL

      public static final int MAXIMAL
      target type: maximal (frequent) item sets
    • GENERATORS

      public static final int GENERATORS
      target type: generator (frequent) item sets
    • GENERAS

      public static final int GENERAS
      target type: generator (frequent) item sets
    • RULES

      public static final int RULES
      target type: association rules
    • IGNORE

      public static final int IGNORE
      item may not appear anywhere in a rule
    • NEITHER

      public static final int NEITHER
      item may not appear anywhere in a rule
    • INPUT

      public static final int INPUT
      item may appear only in a rule body/antecedent
    • BODY

      public static final int BODY
      item may appear only in a rule body/antecedent
    • ANTE

      public static final int ANTE
      item may appear only in a rule body/antecedent
    • ANTECEDENT

      public static final int ANTECEDENT
      item may appear only in a rule body/antecedent
    • OUTPUT

      public static final int OUTPUT
      item may appear only in a rule head/consequent
    • CONS

      public static final int CONS
      item may appear only in a rule head/consequent
    • CONSEQUENT

      public static final int CONSEQUENT
      item may appear only in a rule head/consequent
    • BOTH

      public static final int BOTH
      item may appear anywhere in a rule
    • INOUT

      public static final int INOUT
      item may appear anywhere in a rule
    • CANDA

      public static final int CANDA
      item may appear anywhere in a rule
    • NONE

      public static final int NONE
      evaluation measure/aggregation mode: none
    • SUPPORT

      public static final int SUPPORT
      evaluation measure: rule support (larger is better)
    • SUPP

      public static final int SUPP
      evaluation measure: rule support (larger is better)
    • CONFIDENCE

      public static final int CONFIDENCE
      evaluation measure: rule confidence (larger is better)
    • CONF

      public static final int CONF
      evaluation measure: rule confidence (larger is better)
    • CONFDIFF

      public static final int CONFDIFF
      evaluation measure: absolute confidence difference to prior (larger is better)
    • LIFT

      public static final int LIFT
      evaluation measure: lift value (confidence divided by prior) (larger is better)
    • LIFTDIFF

      public static final int LIFTDIFF
      evaluation measure: difference of lift value to 1 (larger is better)
    • LIFTQUOT

      public static final int LIFTQUOT
      evaluation measure: difference of lift quotient to 1 (larger is better)
    • CONVICTION

      public static final int CONVICTION
      evaluation measure: conviction (larger is better)
    • CVCT

      public static final int CVCT
      evaluation measure: conviction (larger is better)
    • CVCTDIFF

      public static final int CVCTDIFF
      evaluation measure: difference of conviction to 1 (larger is better)
    • CVCTQUOT

      public static final int CVCTQUOT
      evaluation measure: difference of conviction quotient to 1 (larger is better)
    • CPROB

      public static final int CPROB
      evaluation measure: conditional probability ratio (larger is better)
    • CONDPROB

      public static final int CONDPROB
      evaluation measure: conditional probability ratio (larger is better)
    • IMPORTANCE

      public static final int IMPORTANCE
      evaluation measure: importance (larger is better)
    • IMPORT

      public static final int IMPORT
      evaluation measure: importance (larger is better)
    • CERTAINTY

      public static final int CERTAINTY
      evaluation measure: certainty factor (larger is better)
    • CERT

      public static final int CERT
      evaluation measure: certainty factor (larger is better)
    • CHI2

      public static final int CHI2
      evaluation measure: normalized chi^2 measure (larger is better)
    • CHI2PVAL

      public static final int CHI2PVAL
      evaluation measure: p-value from chi^2 measure (smaller is better)
    • YATES

      public static final int YATES
      evaluation measure: normalized chi^2 measure (Yates corrected) (larger is better)
    • YATESPVAL

      public static final int YATESPVAL
      evaluation measure: p-value from chi^2 measure (Yates corrected) (smaller is better)
    • INFO

      public static final int INFO
      evaluation measure: information difference to prior (larger is better)
    • INFOPVAL

      public static final int INFOPVAL
      evaluation measure: p-value from information difference (smaller is better)
    • FETPROB

      public static final int FETPROB
      evaluation measure: Fisher's exact test (table probability) (smaller is better)
    • FETCHI2

      public static final int FETCHI2
      evaluation measure: Fisher's exact test (chi^2 measure) (smaller is better)
    • FETINFO

      public static final int FETINFO
      evaluation measure: Fisher's exact test (information gain) (smaller is better)
    • FETSUPP

      public static final int FETSUPP
      evaluation measure: Fisher's exact test (support) (smaller is better)
    • LDRATIO

      public static final int LDRATIO
      evaluation measure: binary logarithm of support quotient (larger is better)
    • SIZESIM

      public static final int SIZESIM
      evaluation measure: item set size times cover similarity (larger is better) (only for JIM algorithm)
    • MIN

      public static final int MIN
      aggregation mode: minimum of individual measure values
    • MINIMUM

      public static final int MINIMUM
      aggregation mode: minimum of individual measure values
    • MAX

      public static final int MAX
      aggregation mode: maximum of individual measure values
    • MAXIMUM

      public static final int MAXIMUM
      aggregation mode: maximum of individual measure values
    • AVG

      public static final int AVG
      aggregation mode: average of individual measure values
    • AVERAGE

      public static final int AVERAGE
      aggregation mode: average of individual measure values
    • AUTO

      public static final int AUTO
      algorithm variant: automatic choice (always applicable)
    • APRI_BASIC

      public static final int APRI_BASIC
      Apriori variant: basic algorithm
    • ECLAT_BASIC

      public static final int ECLAT_BASIC
      Eclat variant: transaction id lists intersection (basic)
    • ECLAT_LISTS

      public static final int ECLAT_LISTS
      Eclat variant: transaction id lists intersection (improved)
    • ECLAT_TIDS

      public static final int ECLAT_TIDS
      Eclat variant: transaction id lists intersection (improved)
    • ECLAT_BITS

      public static final int ECLAT_BITS
      Eclat variant: transaction id lists as bit vectors
    • ECLAT_TABLE

      public static final int ECLAT_TABLE
      Eclat variant: item occurrence table (standard)
    • ECLAT_SIMPLE

      public static final int ECLAT_SIMPLE
      Eclat variant: item occurrence table (simplified)
    • ECLAT_RANGES

      public static final int ECLAT_RANGES
      Eclat variant: transaction id range lists intersection
    • ECLAT_OCCDLV

      public static final int ECLAT_OCCDLV
      Eclat variant: occurrence deliver from transaction lists
    • ECLAT_LCM

      public static final int ECLAT_LCM
      Eclat variant: occurrence deliver from transaction lists
    • ECLAT_DIFFS

      public static final int ECLAT_DIFFS
      Eclat variant: transaction id difference sets (diffsets)
    • FPG_SIMPLE

      public static final int FPG_SIMPLE
      FP-growth variant: simple tree nodes (link and parent)
    • FPG_COMPLEX

      public static final int FPG_COMPLEX
      FP-growth variant: complex tree nodes (children and sibling)
    • FPG_SINGLE

      public static final int FPG_SINGLE
      FP-growth variant: top-down processing on a single prefix tree
    • FPG_TOPDOWN

      public static final int FPG_TOPDOWN
      FP-growth variant: top-down processing of the prefix trees
    • SAM_BASIC

      public static final int SAM_BASIC
      SaM variant: basic split and merge algorithm
    • SAM_BSEARCH

      public static final int SAM_BSEARCH
      SaM variant: split and merge with binary search
    • SAM_DOUBLE

      public static final int SAM_DOUBLE
      SaM variant: split and merge with double source buffering
    • SAM_TREE

      public static final int SAM_TREE
      SaM variant: split and merge with transaction prefix tree
    • RELIM_BASIC

      public static final int RELIM_BASIC
      RElim variant: basic recursive elimination algorithm
    • JIM_BASIC

      public static final int JIM_BASIC
      JIM variant: basic algorithm
    • CARP_TABLE

      public static final int CARP_TABLE
      Carpenter variant: item occurrence counter table
    • CARP_LISTS

      public static final int CARP_LISTS
      Carpenter variant: transaction identifier lists
    • CARP_TIDLIST

      public static final int CARP_TIDLIST
      Carpenter variant: transaction identifier lists
    • CARP_TIDLISTS

      public static final int CARP_TIDLISTS
      Carpenter variant: transaction identifier lists
    • ISTA_PREFIX

      public static final int ISTA_PREFIX
      IsTa variant: standard prefix tree
    • ISTA_PATRICIA

      public static final int ISTA_PATRICIA
      IsTa variant: patricia tree (compact prefix tree)
    • NOFIM16

      public static final int NOFIM16
      processing mode: do not use a 16-items machine
    • NOPERFECT

      public static final int NOPERFECT
      processing mode: do not use perfect extension pruning
    • NOSORT

      public static final int NOSORT
      processing mode: do not sort items w.r.t. conditional support
    • NOHUT

      public static final int NOHUT
      processing mode: do not use head union tail (hut) pruning (for maximal item sets)
    • HORZ

      public static final int HORZ
      processing mode: check extensions for closed/maximal item sets with a horizontal scheme (default: use a repository)
    • VERT

      public static final int VERT
      processing mode: check extensions for closed/maximal item sets with a vertical scheme (default: use a repository)
    • INVBXS

      public static final int INVBXS
      processing mode: invalidate evaluation below expected support
    • ORIGSUPP

      public static final int ORIGSUPP
      processing mode: use original support definition for rules (body & head instead of only body)
    • NOTREE

      public static final int NOTREE
      processing mode: do not organize transactions as a prefix tree (for Apriori algorithm)
    • POSTPRUNE

      public static final int POSTPRUNE
      processing mode: a-posteriori pruning of infrequent item sets (for Apriori algorithm)
    • REPOFILT

      public static final int REPOFILT
      processing mode: filter maximal item sets with repository (for Carpenter algorithm)
    • ADDALL

      public static final int ADDALL
      processing mode: add all (closed) item sets to repository (for Carpenter algorithm)
    • NOCOLLATE

      public static final int NOCOLLATE
      processing mode: do not collate equal transactions (for Carpenter algorithm)
    • NOPRUNE

      public static final int NOPRUNE
      processing mode: do not prune the prefix/patricia tree (for IsTa algorithm)
    • JIM_NONE

      public static final int JIM_NONE
      JIM: no cover similarity
    • JIM_RUSSEL_RAO

      public static final int JIM_RUSSEL_RAO
      JIM: Russel-Rao S_R = s/n
    • JIM_KULCYNSKI

      public static final int JIM_KULCYNSKI
      JIM: Kulcynski S_K = s/q
    • JIM_JACCARD

      public static final int JIM_JACCARD
      JIM: Jaccard/Tanimoto S_J = s/r
    • JIM_TANIMOTO

      public static final int JIM_TANIMOTO
      JIM: Jaccard/Tanimoto S_J = s/r
    • JIM_DICE

      public static final int JIM_DICE
      JIM: Dice S_D = 2s/(r+s)
    • JIM_SORENSEN

      public static final int JIM_SORENSEN
      JIM: Sorensen S_D = 2s/(r+s)
    • JIM_CZEKANOWSKI

      public static final int JIM_CZEKANOWSKI
      JIM: Czekanowski S_D = 2s/(r+s)
    • JIM_SOKAL_SNEATH_1

      public static final int JIM_SOKAL_SNEATH_1
      JIM: Sokal--Sneath 1 S_S = s/(r+q)
    • JIM_SOKAL_MICHENER

      public static final int JIM_SOKAL_MICHENER
      JIM: Sokal--Michener S_M = (s+z)/n
    • JIM_HAMMING

      public static final int JIM_HAMMING
      JIM: Hamming S_M = (s+z)/n
    • JIM_FAITH

      public static final int JIM_FAITH
      JIM: Faith S_F = (s+z/2)/n
    • JIM_ROGERS_TANIMOTO

      public static final int JIM_ROGERS_TANIMOTO
      JIM: Rogers--Tanimoto S_T = (s+z)/(n+q)
    • JIM_SOKAL_SNEATH_2

      public static final int JIM_SOKAL_SNEATH_2
      JIM: Sokal--Sneath 2 S_N = 2(s+z)/(n+s+z)
    • JIM_GOWER_LEGENDRE

      public static final int JIM_GOWER_LEGENDRE
      JIM: Gower--Legendre S_N = 2(s+z)/(n+s+z)
    • JIM_SOKAL_SNEATH_3

      public static final int JIM_SOKAL_SNEATH_3
      JIM: Sokal--Sneath 3 S_O = (s+z)/q
    • JIM_BARONI_BUSER

      public static final int JIM_BARONI_BUSER
      JIM: Baroni--Buser S_B = (x+s)/(x+r)
    • JIM_GENERIC

      public static final int JIM_GENERIC
      JIM: generic measure S = (c_0s +c_1z +c_2n +c_3x) / (c_4s +c_5z +c_6n +c_7x)
    • IDENT

      public static final int IDENT
      surrogate method: identity (keep original data)
    • RANDOM

      public static final int RANDOM
      surrogate method: random transaction generation
    • SWAP

      public static final int SWAP
      surrogate method: permutation by pair swaps
    • SHUFFLE

      public static final int SHUFFLE
      surrogate method: shuffle table-derived data (columns)
    • COLUMNS

      public static final String COLUMNS
      pattern spectrum report format: three columns size (integer), support (integer) and (average) occurrence frequency (double)
    • OBJECTS

      public static final String OBJECTS
      pattern spectrum report format: objects of type PatSpecElem
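The JIM_* cover similarity constants in the field details above each name a closed-form measure. As a rough illustration only, the sketch below evaluates a few of those formulas exactly as they are written in the field descriptions. The class CoverSimilarity is a hypothetical helper, not part of JNIFIM, and the quantities s, r, z and n are taken as plain inputs because this page does not define them.

```java
/** Hypothetical helper (not part of fim.JNIFIM): formulas copied from the JIM_* descriptions. */
final class CoverSimilarity {
    private CoverSimilarity() {}

    /** JIM_RUSSEL_RAO: S_R = s/n */
    static double russelRao(double s, double n) {
        return s / n;
    }

    /** JIM_JACCARD, JIM_TANIMOTO: S_J = s/r */
    static double jaccard(double s, double r) {
        return s / r;
    }

    /** JIM_DICE, JIM_SORENSEN, JIM_CZEKANOWSKI: S_D = 2s/(r+s) */
    static double dice(double s, double r) {
        return 2 * s / (r + s);
    }

    /** JIM_SOKAL_MICHENER, JIM_HAMMING: S_M = (s+z)/n */
    static double sokalMichener(double s, double z, double n) {
        return (s + z) / n;
    }

    /** JIM_FAITH: S_F = (s+z/2)/n */
    static double faith(double s, double z, double n) {
        return (s + z / 2) / n;
    }

    /** JIM_SOKAL_SNEATH_2, JIM_GOWER_LEGENDRE: S_N = 2(s+z)/(n+s+z) */
    static double gowerLegendre(double s, double z, double n) {
        return 2 * (s + z) / (n + s + z);
    }

    public static void main(String[] args) {
        // Arbitrary illustrative numbers, chosen only to exercise the formulas.
        double s = 2, r = 4, z = 6, n = 10;
        System.out.println("Russel-Rao      " + russelRao(s, n));
        System.out.println("Jaccard         " + jaccard(s, r));
        System.out.println("Dice            " + dice(s, r));
        System.out.println("Sokal-Michener  " + sokalMichener(s, z, n));
        System.out.println("Faith           " + faith(s, z, n));
        System.out.println("Gower-Legendre  " + gowerLegendre(s, z, n));
    }
}
```

Several constants share one formula (for example JIM_DICE, JIM_SORENSEN and JIM_CZEKANOWSKI), which is why the sketch has fewer methods than there are constants.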
  • Constructor Details

    • JNIFIM

      public JNIFIM()
      Constructor.
      Since:
      2023.07.30 (Christian Borgelt)
  • Method Details

    • fim

      public static Object[] fim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int[] border)
      Java interface to frequent item set mining in C (very simplified interface).
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL or GENERATORS)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      if report != "#" and report != "=": an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.09.26 (Christian Borgelt)
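
      The return layout for a non-spectrum report can be unpacked as in the following sketch. It makes no native call: the result array is built by hand for report = "as", and the class and method names (ItemSetResultDemo, describe) are illustrative only. It assumes that the value arrays follow the order of the characters in report, and that "a" yields an int[] and "s" a double[].

```java
// Sketch only: formats a result laid out like the non-spectrum return value
// described above for report = "as". No native call is made here.
final class ItemSetResultDemo {

    /** Renders one line per item set: "[items] abs=<a> rel=<s>". */
    static String[] describe(Object[] result) {
        Object[] sets = (Object[]) result[0];   // item sets, each an int[]
        int[]    abs  = (int[])    result[1];   // assumed: values for 'a'
        double[] rel  = (double[]) result[2];   // assumed: values for 's'
        String[] lines = new String[sets.length];
        for (int i = 0; i < sets.length; i++) {
            lines[i] = java.util.Arrays.toString((int[]) sets[i])
                     + " abs=" + abs[i] + " rel=" + rel[i];
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for what a call such as
        //   JNIFIM.fim(tracts, null, JNIFIM.CLOSED, -2, 1, 3, "as", null)
        // might return; the values are made up for illustration.
        Object[] result = {
            new int[][]  { {1}, {1, 2} },
            new int[]    { 3, 2 },
            new double[] { 0.75, 0.5 }
        };
        for (String line : describe(result)) {
            System.out.println(line);
        }
    }
}
```

      Casting the first element to Object[] rather than int[][] keeps the sketch valid whichever of the two array types the native side allocates.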
    • xfim

      public static Object[] xfim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, int agg, double thresh, int[] border)
      Java interface to frequent item set mining in C (less simplified interface).
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL or GENERATORS)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "e": value of item set evaluation measure (double)
      "E": value of item set evaluation measure as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      agg - evaluation measure aggregation mode
      (NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      if report != "#" and report != "=": an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.10.01 (Christian Borgelt)
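
      The direction-dependent meaning of thresh can be illustrated with a small check. This is a sketch of the rule as worded above, not the library's code: the class and method names are hypothetical, and whether the bound is inclusive is an assumption.

```java
// Sketch only: thresh acts as a lower bound when larger values are better
// (e.g. CERTAINTY) and as an upper bound when smaller values are better
// (e.g. CHI2PVAL). Inclusive comparison is an assumption of this sketch.
final class EvalThresholdDemo {

    /** Returns true if value qualifies under the stated direction. */
    static boolean passes(double value, double thresh, boolean largerIsBetter) {
        return largerIsBetter ? value >= thresh : value <= thresh;
    }

    public static void main(String[] args) {
        System.out.println(passes(0.8, 0.5, true));    // larger-is-better measure
        System.out.println(passes(0.2, 0.05, false));  // smaller-is-better measure
    }
}
```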
    • arules

      public static Object[] arules(int[][] tracts, int[] wgts, double supp, double conf, int zmin, int zmax, String report, int eval, double thresh, int mode, int[][] appear)
      Java interface to association rule induction in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      supp - minimum support of an association rule
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per association rule
      zmax - maximum number of items per association rule
      report - values to report with an association rule
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "b": absolute body set support (number of transactions, integer)
      "x": relative body set support as a fraction (double)
      "X": relative body set support as a percentage (double)
      "h": absolute head item support (number of transactions, integer)
      "y": relative head item support as a fraction (double)
      "Y": relative head item support as a percentage (double)
      "c": rule confidence as a fraction (double)
      "C": rule confidence as a percentage (double)
      "l": lift value of a rule (confidence/prior) (double)
      "L": lift value of a rule as a percentage (double)
      "e": value of rule evaluation measure (double)
      "E": value of rule evaluation measure as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      eval - measure for association rule evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      mode - operation mode indicators/flags
      (NONE or ORIGSUPP)
      appear - map from items to item appearance indicators, given as two integer arrays of equal size, with the first holding the items and the second the corresponding item appearance indicators.
      This parameter may be null, in which case items may appear anywhere in a rule.
      Each item appearance indicator must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item that has a negative identifier.
      Returns:
      if report != "#" and report != "=": an array with k+2 elements, the first of which is an integer array that contains the head (consequent) items, while the second contains the body (antecedent) item sets as integer arrays; the following k array elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the association rule at the same array index in the first and the second array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.09.26 (Christian Borgelt)
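
      The k+2 layout returned for association rules can be read back as in this sketch. The result array is hand-built for report = "c" and no native call is made. RuleResultDemo and describe are illustrative names, and it is an assumption that "c" yields a double[].

```java
// Sketch only: formats a rule result laid out as described above:
// element 0 holds the head items, element 1 the body item sets,
// and element 2 (for report = "c") the rule confidences.
final class RuleResultDemo {

    /** Renders one line per rule: "[body items] -> head conf=<c>". */
    static String[] describe(Object[] result) {
        int[]    heads  = (int[])    result[0];  // one head item per rule
        Object[] bodies = (Object[]) result[1];  // one int[] body per rule
        double[] conf   = (double[]) result[2];  // assumed: values for 'c'
        String[] lines = new String[heads.length];
        for (int i = 0; i < heads.length; i++) {
            lines[i] = java.util.Arrays.toString((int[]) bodies[i])
                     + " -> " + heads[i] + " conf=" + conf[i];
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up values standing in for the result of a JNIFIM.arules call.
        Object[] result = {
            new int[]    { 3, 1 },
            new int[][]  { {1, 2}, {} },
            new double[] { 0.9, 0.6 }
        };
        for (String line : describe(result)) {
            System.out.println(line);
        }
    }
}
```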
    • apriori

      public static Object[] apriori(int[][] tracts, int[] wgts, int target, double supp, double conf, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)
      Java interface to Apriori algorithm in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL, GENERATORS or RULES)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "b": absolute body set support (number of transactions, integer)
      "x": relative body set support as a fraction (double)
      "X": relative body set support as a percentage (double)
      "h": absolute head item support (number of transactions, integer)
      "y": relative head item support as a fraction (double)
      "Y": relative head item support as a percentage (double)
      "c": rule confidence as a fraction (double)
      "C": rule confidence as a percentage (double)
      "l": lift value of a rule (confidence/prior) (double)
      "L": lift value of a rule as a percentage (double)
      "e": value of rule evaluation measure (double)
      "E": value of rule evaluation measure as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      agg - evaluation measure aggregation mode
      (NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      prune - minimum size for evaluation filtering
      = 0: backward filtering (no subset check)
      < 0: weak forward filtering (one subset must qualify)
      > 0: strong forward filtering (all subsets must qualify)
      algo - algorithm variant to use
      (AUTO or APRI_BASIC)
      mode - operation mode indicators/flags
      (NONE, NOPERFECT, NOTREE, POSTPRUNE, INVBXS, ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      appear - map from items to item appearance indicators, given as two integer arrays of equal size, with the first holding the items and the second the corresponding item appearance indicators.
      This parameter is used only if target = RULES. It may be null, in which case items may appear anywhere in a rule.
      Each item appearance indicator must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item that has a negative identifier.
      Returns:
      if report != "#" and report != "=":
      if target = RULES: an array with k+2 elements, the first of which is an integer array that contains the head (consequent) items, while the second contains the body (antecedent) item sets as integer arrays; the following k array elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the association rule at the same array index in the first and the second array).
      if target != RULES: an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.10.01 (Christian Borgelt)
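
      The two-array shape expected for appear can be built from an ordinary map, as sketched below. The helper name toAppear is hypothetical. The indicator codes BODY, HEAD and BOTH declared in the sketch are placeholders with made-up values, and real code would pass the corresponding JNIFIM constants instead. The entry with item -1 illustrates the negative pseudo-item that sets the default indicator.

```java
// Sketch only: packs an item -> indicator map into the int[2][n] shape
// described for the appear parameter (row 0: items, row 1: indicators).
final class AppearDemo {

    // Placeholder codes for this sketch ONLY; they are not the library's
    // values. Real code would use JNIFIM.BODY, JNIFIM.HEAD, JNIFIM.BOTH.
    static final int BODY = 1, HEAD = 2, BOTH = 3;

    /** Converts the map into two parallel integer arrays of equal size. */
    static int[][] toAppear(java.util.Map<Integer, Integer> indicators) {
        int[][] appear = new int[2][indicators.size()];
        int i = 0;
        for (java.util.Map.Entry<Integer, Integer> e : indicators.entrySet()) {
            appear[0][i] = e.getKey();    // item identifier
            appear[1][i] = e.getValue();  // its appearance indicator
            i++;
        }
        return appear;
    }

    public static void main(String[] args) {
        java.util.Map<Integer, Integer> m = new java.util.LinkedHashMap<>();
        m.put(-1, BOTH);  // negative pseudo-item: default for unlisted items
        m.put(7, HEAD);   // item 7 only as rule head
        m.put(4, BODY);   // item 4 only in rule body
        int[][] appear = toAppear(m);
        System.out.println(java.util.Arrays.toString(appear[0]));
        System.out.println(java.util.Arrays.toString(appear[1]));
    }
}
```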
    • eclat

      public static Object[] eclat(int[][] tracts, int[] wgts, int target, double conf, double supp, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)
      Java interface to Eclat algorithm in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL, GENERATORS or RULES)
      conf - minimum confidence of an association rule
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "b": absolute body set support (number of transactions, integer)
      "x": relative body set support as a fraction (double)
      "X": relative body set support as a percentage (double)
      "h": absolute head item support (number of transactions, integer)
      "y": relative head item support as a fraction (double)
      "Y": relative head item support as a percentage (double)
      "c": rule confidence as a fraction (double)
      "C": rule confidence as a percentage (double)
      "l": lift value of a rule (confidence/prior) (double)
      "L": lift value of a rule as a percentage (double)
      "e": value of rule evaluation measure (double)
      "E": value of rule evaluation measure as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      agg - evaluation measure aggregation mode
      (NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      prune - minimum size for evaluation filtering
      = 0: backward filtering (no subset check)
      < 0: weak forward filtering (one subset must qualify)
      > 0: strong forward filtering (all subsets must qualify)
      algo - algorithm variant to use
      (AUTO, ECLAT_BASIC, ECLAT_TIDS, ECLAT_BITS, ECLAT_TABLE, ECLAT_SIMPLE, ECLAT_RANGES, ECLAT_OCCDLV, ECLAT_DIFFS)
      mode - operation mode indicators/flags
      (NONE, NOFIM16, NOPERFECT, NOSORT, NOHUT, HORZ, VERT, INVBXS, ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      appear - map from items to item appearance indicators, given as two integer arrays of equal size, with the first holding the items and the second the corresponding item appearance indicators.
      This parameter is used only if target = RULES. It may be null, in which case items may appear anywhere in a rule.
      Each item appearance indicator must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item that has a negative identifier.
      Returns:
      if report != "#" and report != "=":
      if target = RULES: an array with k+2 elements, the first of which is an integer array that contains the head (consequent) items, while the second contains the body (antecedent) item sets as integer arrays; the following k array elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the association rule at the same array index in the first and the second array).
      if target != RULES: an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2015.02.27 (Christian Borgelt)
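
      For report = "#", the three parallel arrays can be turned into one row per pattern signature, as in the following sketch. The input is hand-built and no native call is made. PatternSpectrumDemo and rows are illustrative names, and the element types (int[], int[], double[]) are taken from the description above.

```java
// Sketch only: reads a pattern spectrum in column format, i.e. three
// parallel arrays holding size, support and occurrence frequency.
final class PatternSpectrumDemo {

    /** Renders one line per (size, support) signature. */
    static String[] rows(Object[] spectrum) {
        int[]    sizes = (int[])    spectrum[0];  // item set sizes
        int[]    supps = (int[])    spectrum[1];  // support values
        double[] freqs = (double[]) spectrum[2];  // occurrence frequencies
        String[] rows = new String[sizes.length];
        for (int i = 0; i < sizes.length; i++) {
            rows[i] = "size=" + sizes[i] + " supp=" + supps[i]
                    + " freq=" + freqs[i];
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up spectrum: 12 sets of size 2 with support 5, and so on.
        Object[] spectrum = {
            new int[]    { 2, 3 },
            new int[]    { 5, 4 },
            new double[] { 12.0, 1.5 }
        };
        for (String row : rows(spectrum)) {
            System.out.println(row);
        }
    }
}
```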
    • fpgrowth

      public static Object[] fpgrowth(int[][] tracts, int[] wgts, int target, double supp, double conf, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)
      Java interface to FP-growth algorithm in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL, GENERATORS or RULES)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "b": absolute body set support (number of transactions, integer)
      "x": relative body set support as a fraction (double)
      "X": relative body set support as a percentage (double)
      "h": absolute head item support (number of transactions, integer)
      "y": relative head item support as a fraction (double)
      "Y": relative head item support as a percentage (double)
      "c": rule confidence as a fraction (double)
      "C": rule confidence as a percentage (double)
      "l": lift value of a rule (confidence/prior) (double)
      "L": lift value of a rule as a percentage (double)
      "e": value of rule evaluation measure (double)
      "E": value of rule evaluation measure as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      agg - evaluation measure aggregation mode
      (NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      prune - minimum size for evaluation filtering
      = 0: backward filtering (no subset check)
      < 0: weak forward filtering (one subset must qualify)
      > 0: strong forward filtering (all subsets must qualify)
      algo - algorithm variant to use
      (AUTO, FPG_SIMPLE, FPG_COMPLEX, FPG_SINGLE, FPG_TOPDOWN)
      mode - operation mode indicators/flags
      (NONE, NOFIM16, NOPERFECT, NOSORT, NOHUT, INVBXS, ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      appear - map from items to item appearance indicators, given as two integer arrays of equal size, with the first holding the items and the second the corresponding item appearance indicators.
      This parameter is used only if target = RULES. It may be null, in which case items may appear anywhere in a rule.
      Each item appearance indicator must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item that has a negative identifier.
      Returns:
      if report != "#" and report != "=":
      if target = RULES: an array with k+2 elements, the first of which is an integer array that contains the head (consequent) items, while the second contains the body (antecedent) item sets as integer arrays; the following k array elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the association rule at the same array index in the first and the second array).
      if target != RULES: an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.10.01 (Christian Borgelt)
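
      The sign convention of supp (positive: percentage, negative: absolute number) can be made explicit with a small conversion. This is a sketch of the convention as documented, not the library's code: the names are hypothetical, and any rounding the C code may apply is not reproduced.

```java
// Sketch only: converts a supp argument into an absolute support value.
// A positive supp is read as a percentage of the total transaction
// weight, a negative supp as an absolute number (its magnitude).
final class SupportDemo {

    /** Returns the absolute minimum support implied by supp. */
    static double toAbsolute(double supp, double totalWeight) {
        if (supp < 0) {
            return -supp;                     // negative: absolute number
        }
        return supp * totalWeight / 100.0;    // positive: percentage
    }

    public static void main(String[] args) {
        System.out.println(toAbsolute(10.0, 50));  // 10% of 50 transactions
        System.out.println(toAbsolute(-3.0, 50));  // at least 3 transactions
    }
}
```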
    • sam

      public static Object[] sam(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to SaM algorithm in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED or MAXIMAL)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "e": value of item set evaluation measure (double)
      "E": value of item set evaluation measure as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      eval - measure for item set evaluation
      (NONE, LDRATIO)
      thresh - threshold for evaluation measure
      algo - algorithm variant to use
      (AUTO, SAM_SIMPLE, SAM_BSEARCH, SAM_DOUBLE, SAM_TREE)
      mode - operation mode indicators/flags
      (NONE, NOFIM16, NOPERFECT)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      if report != "#" and report != "=": an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.10.01 (Christian Borgelt)
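
      The border parameter, indexed by item set size, can be pictured as a per-size support filter. The sketch below is one plausible reading of the description and not the library's implementation. The names are hypothetical, and it is an assumption that the comparison is inclusive and that sizes beyond the end of the array are left unfiltered.

```java
// Sketch only: applies a per-size support threshold, where border[size]
// is the threshold for item sets with that many items.
final class BorderFilterDemo {

    /** Returns true if an item set of the given size and support is kept. */
    static boolean passes(int size, int support, int[] border) {
        if (border == null || size >= border.length) {
            return true;  // assumption: no extra filtering for this size
        }
        return support >= border[size];  // assumption: inclusive bound
    }

    public static void main(String[] args) {
        int[] border = { 0, 0, 5, 3 };  // thresholds for sizes 0 to 3
        System.out.println(passes(2, 4, border));  // size 2 needs support 5
        System.out.println(passes(3, 4, border));  // size 3 needs support 3
    }
}
```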
    • relim

      public static Object[] relim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to RElim algorithm in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED or MAXIMAL)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "e": value of item set evaluation measure (double)
      "E": value of item set evaluation measure as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      eval - measure for item set evaluation
      (NONE, LDRATIO)
      thresh - threshold for evaluation measure
      algo - algorithm variant to use
      (AUTO, RELIM_BASIC)
      mode - operation mode indicators/flags
      (NONE, NOFIM16, NOPERFECT)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      if report != "#" and report != "=": an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.10.01 (Christian Borgelt)
    • jim

      public static Object[] jim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int covsim, double[] simps, double sim, int algo, int mode, int[] border)
      Java interface to JIM algorithm in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED or MAXIMAL)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "e": value of item set evaluation measure (double)
      "E": value of item set evaluation measure as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      eval - measure for item set evaluation
      (NONE, LDRATIO)
      thresh - threshold for evaluation measure
      covsim - cover similarity measure (JIM_NONE, JIM_RUSSEL_RAO, JIM_KULCZYNSKI, JIM_JACCARD, JIM_TANIMOTO, JIM_DICE, JIM_SORENSEN, JIM_CZEKANOWKSI, JIM_SOKAL_SNEATH_1, JIM_SOKAL_MICHENER, JIM_HAMMING, JIM_FAITH, JIM_ROGERS_TANIMOTO, JIM_SOKAL_SNEATH_2, JIM_GOWER_LEGENDRE, JIM_SOKAL_SNEATH_3, JIM_BARONI_BUSER, JIM_GENERIC)
      simps - cover similarity measure parameters (if covsim is JIM_GENERIC): S = (c_0 s + c_1 z + c_2 n + c_3 x) / (c_4 s + c_5 z + c_6 n + c_7 x)
      sim - threshold for cover similarity measure
      algo - algorithm variant to use
      (AUTO, SAM_SIMPLE, SAM_BSEARCH, SAM_DOUBLE, SAM_TREE)
      mode - operation mode indicators/flags
      (NONE, NOFIM16, NOPERFECT)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      if report != "#" and report != "=": an array with k+1 elements, where k is the number of characters in report. The first element is an array that contains the item sets as integer arrays; the following k elements are arrays that hold the values selected by the characters in report (each value corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2018.03.21 (Christian Borgelt)
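      The generic cover similarity formula given for simps can be evaluated as in the following sketch. The class GenericCoverSimilarity and the method similarity are hypothetical names introduced for illustration and are not part of JNIFIM. This section does not say what s, z, n and x denote, so they are treated as plain numeric inputs, and the coefficients are assumed to be passed in the order c_0 to c_7.

```java
public class GenericCoverSimilarity {
    /**
     * Evaluates S = (c0*s + c1*z + c2*n + c3*x) / (c4*s + c5*z + c6*n + c7*x).
     * The array c is assumed to hold the eight coefficients c_0 .. c_7 in order.
     */
    public static double similarity(double[] c, double s, double z, double n, double x) {
        if (c == null || c.length != 8)
            throw new IllegalArgumentException("expected 8 coefficients");
        double num = c[0] * s + c[1] * z + c[2] * n + c[3] * x;
        double den = c[4] * s + c[5] * z + c[6] * n + c[7] * x;
        return num / den;  // NaN or infinite if the denominator is zero
    }

    public static void main(String[] args) {
        // coefficients chosen so that S = s / (s + x)
        double[] c = {1, 0, 0, 0, 1, 0, 0, 1};
        System.out.println(similarity(c, 3, 0, 10, 1));  // prints 0.75
    }
}
```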
    • carpenter

      public static Object[] carpenter(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to Carpenter algorithm in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (CLOSED or MAXIMAL)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "e": value of item set evaluation measure (double)
      "E": value of item set evaluation measure as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      eval - measure for item set evaluation (NONE, LDRATIO)
      thresh - threshold for evaluation measure
      algo - algorithm variant to use
      (AUTO, CARP_TABLE, CARP_TIDLIST)
      mode - operation mode indicators/flags
      (NONE, NOPERFECT, REPOFILT, MAXONLY, NOCOLLATE)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      if report != "#" and report != "=": an array with k+1 elements, where k is the number of characters in report. The first element is an array that contains the item sets as integer arrays; the following k elements are arrays that hold the values selected by the characters in report (each value corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.10.01 (Christian Borgelt)
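      The supp convention (positive: percentage, negative: absolute number) can be illustrated with a small helper. This is only a sketch: the class SupportThreshold and the method toAbsolute are hypothetical, and rounding a fractional threshold up is an assumption that may differ from what the native code does.

```java
public class SupportThreshold {
    /**
     * Converts a supp argument into an absolute support threshold.
     * supp >= 0 is read as a percentage of totalWeight,
     * supp <  0 is read as an absolute number (sign dropped).
     * Rounding up is an assumption made for this illustration.
     */
    public static int toAbsolute(double supp, int totalWeight) {
        if (supp < 0)
            return (int) Math.ceil(-supp);
        return (int) Math.ceil(supp * totalWeight / 100.0);
    }

    public static void main(String[] args) {
        System.out.println(toAbsolute(10, 50));   // 10% of 50 transactions
        System.out.println(toAbsolute(-3, 50));   // at least 3 transactions
    }
}
```

      With unit weights, totalWeight is simply the number of transactions in tracts.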
    • ista

      public static Object[] ista(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to IsTa algorithm in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (CLOSED or MAXIMAL)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "e": value of item set evaluation measure (double)
      "E": value of item set evaluation measure as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      eval - measure for item set evaluation (NONE, LDRATIO)
      thresh - threshold for evaluation measure
      algo - algorithm variant to use
      (AUTO, ISTA_PREFIX, ISTA_PATRICIA)
      mode - operation mode indicators/flags
      (NONE, NOPRUNE, REPOFILT)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      if report != "#" and report != "=": an array with k+1 elements, where k is the number of characters in report. The first element is an array that contains the item sets as integer arrays; the following k elements are arrays that hold the values selected by the characters in report (each value corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.10.01 (Christian Borgelt)
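      The layout of the returned array for a report string such as "as" can be read as in the sketch below. It works on mock data instead of calling the native library. The class ResultUnpacker is hypothetical, and the element types int[][] (item sets), int[] (for "a") and double[] (for "s") are assumptions derived from the "integer" and "double" annotations in the parameter description.

```java
import java.util.Arrays;

public class ResultUnpacker {
    /**
     * Formats item set i together with its values, assuming report = "as":
     * result[0] holds the item sets, result[1] the absolute supports,
     * result[2] the relative supports (same array index throughout).
     */
    public static String describe(Object[] result, int i) {
        int[][] sets = (int[][]) result[0];
        int[] abs = (int[]) result[1];
        double[] rel = (double[]) result[2];
        return Arrays.toString(sets[i]) + " a=" + abs[i] + " s=" + rel[i];
    }

    public static void main(String[] args) {
        // mock data shaped like the documented return value (k = 2)
        Object[] mock = {
            new int[][] {{1, 2}, {3}},
            new int[] {4, 7},
            new double[] {0.4, 0.7}
        };
        int count = ((int[][]) mock[0]).length;
        for (int i = 0; i < count; i++)
            System.out.println(describe(mock, i));
    }
}
```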
    • apriacc

      public static Object[] apriacc(int[][] tracts, int[] wgts, double supp, int zmin, int zmax, String report, int stat, double siglvl, int prune, int mode, int[] border)
      Java interface to accretion-style Apriori algorithm in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "p": p-value of item set test as a fraction (double)
      "P": p-value of item set test as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      stat - test statistic for item set evaluation (NONE, CHI2PVAL, YATESPVAL, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      siglvl - significance level (maximum p-value)
      prune - minimum size for evaluation filtering
      = 0: backward filtering (no subset check)
      < 0: weak forward filtering (one subset must qualify)
      > 0: strong forward filtering (all subsets must qualify)
      mode - operation mode indicators/flags
      (NONE, INVBXS)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      if report != "#" and report != "=": an array with k+1 elements, where k is the number of characters in report. The first element is an array that contains the item sets as integer arrays; the following k elements are arrays that hold the values selected by the characters in report (each value corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.10.01 (Christian Borgelt)
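      The three cases of the prune parameter can be spelled out with a small decoder. The class PruneMode is hypothetical, and reading the absolute value of prune as the minimum item set size is an assumption based on the phrase "minimum size for evaluation filtering"; the native code may interpret the value differently.

```java
public class PruneMode {
    /** Returns a textual description of a prune argument. */
    public static String describe(int prune) {
        if (prune == 0)
            return "backward filtering (no subset check)";
        String kind = (prune < 0)
            ? "weak forward filtering (one subset must qualify)"
            : "strong forward filtering (all subsets must qualify)";
        // assumption: |prune| is the minimum size from which filtering applies
        return kind + ", minimum size " + Math.abs(prune);
    }

    public static void main(String[] args) {
        for (int p : new int[] {0, -2, 3})
            System.out.println(p + ": " + describe(p));
    }
}
```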
    • accretion

      public static Object[] accretion(int[][] tracts, int[] wgts, double supp, int zmin, int zmax, String report, int stat, double siglvl, int maxext, int mode, int[] border)
      Java interface to Accretion algorithm in C.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - values to report with an item set
      (multiple values possible except for "#" and "=", which have to form a one-character string)
      "a": absolute item set support (number of transactions, integer)
      "s": relative item set support (fraction of transactions, double)
      "S": relative item set support (percentage of transactions, double)
      "p": p-value of item set test as a fraction (double)
      "P": p-value of item set test as a percentage (double)
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      stat - test statistic for item set evaluation (NONE, CHI2PVAL, YATESPVAL, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      siglvl - significance level (maximum p-value)
      maxext - maximum number of extension items
      mode - operation mode indicators/flags
      (NONE, INVBXS)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      if report != "#" and report != "=": an array with k+1 elements, where k is the number of characters in report. The first element is an array that contains the item sets as integer arrays; the following k elements are arrays that hold the values selected by the characters in report (each value corresponds to the item set at the same array index in the first array).
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.10.01 (Christian Borgelt)
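      The rule for report strings (several value characters may be combined, but "#" and "=" must stand alone) can be checked before calling a mining method. The sketch below is a hypothetical helper, not part of JNIFIM. The set of allowed value characters is passed in, for example "asSpP" for this method, and accepting an empty string (k = 0) is an assumption.

```java
public class ReportString {
    /**
     * Checks a report string against the documented rule:
     * "#" and "=" must form a one-character string, any other
     * string may combine characters from the allowed set.
     */
    public static boolean isValid(String report, String allowed) {
        if (report == null)
            return false;
        if (report.equals("#") || report.equals("="))
            return true;
        for (char ch : report.toCharArray())
            if (allowed.indexOf(ch) < 0)
                return false;       // unknown character, or "#"/"=" mixed in
        return true;                // assumption: "" is accepted (k = 0)
    }

    public static void main(String[] args) {
        System.out.println(isValid("aP", "asSpP"));  // true
        System.out.println(isValid("a#", "asSpP"));  // false
    }
}
```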
    • genpsp

      public static Object[] genpsp(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int cnt, int surr, int seed, int cpus, int[] ctrl)
      Pattern spectrum generation with surrogate data sets.
      Parameters:
      tracts - array of transactions to process, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (SETS, ALL or FREQUENT)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - format in which to report the pattern spectrum
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      cnt - number of surrogate data sets to generate
      surr - surrogate data generation method (IDENT, RANDOM, SWAP or SHUFFLE)
      seed - seed value for random number generator
      cpus - number of CPUs/threads to use
      ctrl - control array (progress indicator, stop flag)
      Returns:
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.09.26 (Christian Borgelt)
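      A pattern spectrum in column format (report = "#") consists of three parallel arrays, which the sketch below walks through on mock data. The class PatternSpectrumColumns and its methods are hypothetical, and the Java element types int[], int[] and double[] are assumptions based on the "integers" and "double precision floating point value" wording of the return description.

```java
import java.util.Locale;

public class PatternSpectrumColumns {
    /** Formats row i as "size support frequency". */
    public static String row(Object[] psp, int i) {
        int[] sizes = (int[]) psp[0];      // item set sizes
        int[] supps = (int[]) psp[1];      // support values
        double[] freqs = (double[]) psp[2]; // occurrence frequencies
        return String.format(Locale.ROOT, "%d %d %.2f", sizes[i], supps[i], freqs[i]);
    }

    /** Sums the frequencies of all signatures with the given item set size. */
    public static double totalFrequency(Object[] psp, int size) {
        int[] sizes = (int[]) psp[0];
        double[] freqs = (double[]) psp[2];
        double sum = 0;
        for (int i = 0; i < sizes.length; i++)
            if (sizes[i] == size)
                sum += freqs[i];
        return sum;
    }

    public static void main(String[] args) {
        // mock spectrum with three (size, support) signatures
        Object[] mock = {
            new int[] {2, 2, 3},
            new int[] {2, 3, 2},
            new double[] {5.0, 1.5, 0.25}
        };
        for (int i = 0; i < 3; i++)
            System.out.println(row(mock, i));
        System.out.println(totalFrequency(mock, 2));
    }
}
```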
    • estpsp

      public static Object[] estpsp(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int equiv, double alpha, int smpls, int seed)
      Estimate a pattern spectrum from data characteristics.
      Parameters:
      tracts - array of transactions to analyze, each of which is an array of integers
      wgts - weights of the transactions (same array index)
      (may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
      target - type of the item sets to find (SETS, ALL or FREQUENT)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      report - format in which to report the pattern spectrum
      "#": pattern spectrum in column format
      "=": pattern spectrum with PatSpecElem
      equiv - equivalent number of surrogate data sets
      alpha - probability dispersion factor
      smpls - number of samples per item set size
      seed - seed value for random number generator
      Returns:
      if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
      if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
      Since:
      2014.09.26 (Christian Borgelt)
    • abort

      public static void abort(int state)
      Set the abort state (abort computations or clear abort state).
      Parameters:
      state - abort state to set (0: clear; != 0: signal abort)
      Since:
      2015.03.05 (Christian Borgelt)
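      One way to use the abort state is a timeout watchdog: run a computation in a worker thread, signal abort if it takes too long, and clear the state afterwards. The sketch below is an assumption-laden illustration, not a documented usage pattern. The class AbortWatchdog is hypothetical, and the abort call is passed in as a hook so that the example runs without the native library; in real use the hook might be JNIFIM::abort. How a running mining call reacts to the abort signal (for example what it returns) is not stated in this section.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.IntConsumer;

public class AbortWatchdog {
    /**
     * Runs task in a worker thread. If it does not finish within millis,
     * abortHook.accept(1) signals abort and the method waits for the task
     * to stop. The abort state is always cleared with abortHook.accept(0).
     */
    public static <T> T runWithTimeout(Callable<T> task, long millis, IntConsumer abortHook) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = pool.submit(task);
            try {
                return future.get(millis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                abortHook.accept(1);   // != 0: signal abort
                return future.get();   // wait until the task has stopped
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            abortHook.accept(0);       // 0: clear the abort state
            pool.shutdown();
        }
    }
}
```

      The second future.get() blocks until the task returns, so this pattern only makes sense if the computation actually checks the abort state.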