Package fim
Class JNIFIM
java.lang.Object
fim.JNIFIM
Class for Java interface to frequent item set mining in C
Since:
2014.09.26
Field Summary
Fields
Modifier and Type    Field    Description
static final int    ADDALL    processing mode: add all (closed) item sets to repository (for Carpenter algorithm)
static final int    ALL    target type: all frequent item sets
static final int    ANTE    item may appear only in a rule body/antecedent
static final int    ANTECEDENT    item may appear only in a rule body/antecedent
static final int    APRI_BASIC    Apriori variant: basic algorithm
static final int    AUTO    algorithm variant: automatic choice (always applicable)
static final int    AVERAGE    aggregation mode: average of individual measure values
static final int    AVG    aggregation mode: average of individual measure values
static final int    BODY    item may appear only in a rule body/antecedent
static final int    BOTH    item may appear anywhere in a rule
static final int    CANDA    item may appear anywhere in a rule
static final int    CARP_LISTS    Carpenter variant: transaction identifier lists
static final int    CARP_TABLE    Carpenter variant: item occurrence counter table
static final int    CARP_TIDLIST    Carpenter variant: transaction identifier lists
static final int    CARP_TIDLISTS    Carpenter variant: transaction identifier lists
static final int    CERT    evaluation measure: certainty factor (larger is better)
static final int    CERTAINTY    evaluation measure: certainty factor (larger is better)
static final int    CHI2    evaluation measure: normalized chi^2 measure (larger is better)
static final int    CHI2PVAL    evaluation measure: p-value from chi^2 measure (smaller is better)
static final int    CLOSED    target type: closed (frequent) item sets
static final String    COLUMNS    pattern spectrum report format: three columns size (integer), support (integer) and (average) occurrence frequency (double)
static final int    CONDPROB    evaluation measure: conditional probability ratio (larger is better)
static final int    CONF    evaluation measure: rule confidence (larger is better)
static final int    CONFDIFF    evaluation measure: absolute confidence difference to prior (larger is better)
static final int    CONFIDENCE    evaluation measure: rule confidence (larger is better)
static final int    CONS    item may appear only in a rule head/consequent
static final int    CONSEQUENT    item may appear only in a rule head/consequent
static final int    CONVICTION    evaluation measure: conviction (larger is better)
static final int    CPROB    evaluation measure: conditional probability ratio (larger is better)
static final int    CVCT    evaluation measure: conviction (larger is better)
static final int    CVCTDIFF    evaluation measure: difference of conviction to 1 (larger is better)
static final int    CVCTQUOT    evaluation measure: difference of conviction quotient to 1 (larger is better)
static final int    ECLAT_BASIC    Eclat variant: transaction id lists intersection (basic)
static final int    ECLAT_BITS    Eclat variant: transaction id lists as bit vectors
static final int    ECLAT_DIFFS    Eclat variant: transaction id difference sets (diffsets)
static final int    ECLAT_LCM    Eclat variant: occurrence deliver from transaction lists
static final int    ECLAT_LISTS    Eclat variant: transaction id lists intersection (improved)
static final int    ECLAT_OCCDLV    Eclat variant: occurrence deliver from transaction lists
static final int    ECLAT_RANGES    Eclat variant: transaction id range lists intersection
static final int    ECLAT_SIMPLE    Eclat variant: item occurrence table (simplified)
static final int    ECLAT_TABLE    Eclat variant: item occurrence table (standard)
static final int    ECLAT_TIDS    Eclat variant: transaction id lists intersection (improved)
static final int    FETCHI2    evaluation measure: Fisher's exact test (chi^2 measure) (smaller is better)
static final int    FETINFO    evaluation measure: Fisher's exact test (information gain) (smaller is better)
static final int    FETPROB    evaluation measure: Fisher's exact test (table probability) (smaller is better)
static final int    FETSUPP    evaluation measure: Fisher's exact test (support) (smaller is better)
static final int    FPG_COMPLEX    FP-growth variant: complex tree nodes (children and sibling)
static final int    FPG_SIMPLE    FP-growth variant: simple tree nodes (link and parent)
static final int    FPG_SINGLE    FP-growth variant: top-down processing on a single prefix tree
static final int    FPG_TOPDOWN    FP-growth variant: top-down processing of the prefix trees
static final int    FREQUENT    target type: all frequent item sets
static final int    GENERAS    target type: generator (frequent) item sets
static final int    GENERATORS    target type: generator (frequent) item sets
static final int    HEAD    item may appear only in a rule head/consequent
static final int    HORZ    processing mode: check extensions for closed/maximal item sets with a horizontal scheme (default: use a repository)
static final int    IDENT    surrogate method: identity (keep original data)
static final int    IGNORE    item may not appear anywhere in a rule
static final int    IMPORT    evaluation measure: importance (larger is better)
static final int    IMPORTANCE    evaluation measure: importance (larger is better)
static final int    INFO    evaluation measure: information difference to prior (larger is better)
static final int    INFOPVAL    evaluation measure: p-value from information difference (smaller is better)
static final int    INOUT    item may appear anywhere in a rule
static final int    INPUT    item may appear only in a rule body/antecedent
static final int    INVBXS    processing mode: invalidate evaluation below expected support
static final int    ISTA_PATRICIA    IsTa variant: patricia tree (compact prefix tree)
static final int    ISTA_PREFIX    IsTa variant: standard prefix tree
static final int    JIM_BARONI_BUSER    JIM: Baroni--Buser S_B = (x+s)/(x+r)
static final int    JIM_BASIC    JIM variant: basic algorithm
static final int    JIM_CZEKANOWSKI    JIM: Czekanowski S_D = 2s/(r+s)
static final int    JIM_DICE    JIM: Dice S_D = 2s/(r+s)
static final int    JIM_FAITH    JIM: Faith S_F = (s+z/2)/n
static final int    JIM_GENERIC    JIM: generic measure S = (c_0s +c_1z +c_2n +c_3x) / (c_4s +c_5z +c_6n +c_7x)
static final int    JIM_GOWER_LEGENDRE    JIM: Gower--Legendre S_N = 2(s+z)/(n+s+z)
static final int    JIM_HAMMING    JIM: Hamming S_M = (s+z)/n
static final int    JIM_JACCARD    JIM: Jaccard/Tanimoto S_J = s/r
static final int    JIM_KULCYNSKI    JIM: Kulcynski S_K = s/q
static final int    JIM_NONE    JIM: no cover similarity
static final int    JIM_ROGERS_TANIMOTO    JIM: Rogers--Tanimoto S_T = (s+z)/(n+q)
static final int    JIM_RUSSEL_RAO    JIM: Russel-Rao S_R = s/n
static final int    JIM_SOKAL_MICHENER    JIM: Sokal--Michener S_M = (s+z)/n
static final int    JIM_SOKAL_SNEATH_1    JIM: Sokal--Sneath 1 S_S = s/(r+q)
static final int    JIM_SOKAL_SNEATH_2    JIM: Sokal--Sneath 2 S_N = 2(s+z)/(n+s+z)
static final int    JIM_SOKAL_SNEATH_3    JIM: Sokal--Sneath 3 S_O = (s+z)/q
static final int    JIM_SORENSEN    JIM: Sorensen S_D = 2s/(r+s)
static final int    JIM_TANIMOTO    JIM: Jaccard/Tanimoto S_J = s/r
static final int    LDRATIO    evaluation measure: binary logarithm of support quotient (larger is better)
static final int    LIFT    evaluation measure: lift value (confidence divided by prior) (larger is better)
static final int    LIFTDIFF    evaluation measure: difference of lift value to 1 (larger is better)
static final int    LIFTQUOT    evaluation measure: difference of lift quotient to 1 (larger is better)
static final int    MAX    aggregation mode: maximum of individual measure values
static final int    MAXIMAL    target type: maximal (frequent) item sets
static final int    MAXIMUM    aggregation mode: maximum of individual measure values
static final int    MIN    aggregation mode: minimum of individual measure values
static final int    MINIMUM    aggregation mode: minimum of individual measure values
static final int    NEITHER    item may not appear anywhere in a rule
static final int    NOCOLLATE    processing mode: do not collate equal transactions (for Carpenter algorithm)
static final int    NOFIM16    processing mode: do not use a 16-items machine
static final int    NOHUT    processing mode: do not use head union tail (hut) pruning (for maximal item sets)
static final int    NONE    evaluation measure/aggregation mode: none
static final int    NOPERFECT    processing mode: do not use perfect extension pruning
static final int    NOPRUNE    processing mode: do not prune the prefix/patricia tree (for IsTa algorithm)
static final int    NOSORT    processing mode: do not sort items w.r.t. conditional support
static final int    NOTREE    processing mode: do not organize transactions as a prefix tree (for Apriori algorithm)
static final String    OBJECTS    pattern spectrum report format: objects of type PatSpecElem
static final int    ORIGSUPP    processing mode: use original support definition for rules (body & head instead of only body)
static final int    OUTPUT    item may appear only in a rule head/consequent
static final int    POSTPRUNE    processing mode: a-posteriori pruning of infrequent item sets (for Apriori algorithm)
static final int    RANDOM    surrogate method: random transaction generation
static final int    RELIM_BASIC    RElim variant: basic recursive elimination algorithm
static final int    REPOFILT    processing mode: filter maximal item sets with repository (for Carpenter algorithm)
static final int    RULES    target type: association rules
static final int    SAM_BASIC    SaM variant: basic split and merge algorithm
static final int    SAM_BSEARCH    SaM variant: split and merge with binary search
static final int    SAM_DOUBLE    SaM variant: split and merge with double source buffering
static final int    SAM_TREE    SaM variant: split and merge with transaction prefix tree
static final int    SETS    target type: all frequent item sets
static final int    SHUFFLE    surrogate method: shuffle table-derived data (columns)
static final int    SIZESIM    evaluation measure: item set size times cover similarity (larger is better) (only for JIM algorithm)
static final int    SUPP    evaluation measure: rule support (larger is better)
static final int    SUPPORT    evaluation measure: rule support (larger is better)
static final int    SWAP    surrogate method: permutation by pair swaps
static final String    VERSION    the version string
static final int    VERT    processing mode: check extensions for closed/maximal item sets with a vertical scheme (default: use a repository)
static final int    YATES    evaluation measure: normalized chi^2 measure (Yates corrected) (larger is better)
static final int    YATESPVAL    evaluation measure: p-value from chi^2 measure (Yates corrected) (smaller is better)
Constructor Summary
Constructors
Constructor    Description
JNIFIM()    Constructor.
Method Summary
Modifier and Type    Method    Description
static void    abort(int state)    Set the abort state (abort computations or clear abort state).
static Object[]    accretion(int[][] tracts, int[] wgts, double supp, int zmin, int zmax, String report, int stat, double siglvl, int maxext, int mode, int[] border)    Java interface to Accretion algorithm in C.
static Object[]    apriacc(int[][] tracts, int[] wgts, double supp, int zmin, int zmax, String report, int stat, double siglvl, int prune, int mode, int[] border)    Java interface to accretion-style Apriori algorithm in C.
static Object[]    apriori(int[][] tracts, int[] wgts, int target, double supp, double conf, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)    Java interface to Apriori algorithm in C.
static Object[]    arules(int[][] tracts, int[] wgts, double supp, double conf, int zmin, int zmax, String report, int eval, double thresh, int mode, int[][] appear)    Java interface to association rule induction in C.
static Object[]    carpenter(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)    Java interface to Carpenter algorithm in C.
static Object[]    eclat(int[][] tracts, int[] wgts, int target, double supp, double conf, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)    Java interface to Eclat algorithm in C.
static Object[]    estpsp(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int equiv, double alpha, int smpls, int seed)    Estimate a pattern spectrum from data characteristics.
static Object[]    fim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int[] border)    Java interface to frequent item set mining in C (very simplified interface).
static Object[]    fpgrowth(int[][] tracts, int[] wgts, int target, double supp, double conf, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)    Java interface to FP-growth algorithm in C.
static Object[]    genpsp(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int cnt, int surr, int seed, int cpus, int[] ctrl)    Pattern spectrum generation with surrogate data sets.
static Object[]    ista(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)    Java interface to IsTa algorithm in C.
static Object[]    jim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int covsim, double[] simps, double sim, int algo, int mode, int[] border)    Java interface to JIM algorithm in C.
static Object[]    relim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)    Java interface to RElim algorithm in C.
static Object[]    sam(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)    Java interface to SaM algorithm in C.
static Object[]    xfim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, int agg, double thresh, int[] border)    Java interface to frequent item set mining in C (less simplified interface).
Field Details
VERSION
public static final String VERSION
the version string

SETS
public static final int SETS
target type: all frequent item sets

ALL
public static final int ALL
target type: all frequent item sets

FREQUENT
public static final int FREQUENT
target type: all frequent item sets

CLOSED
public static final int CLOSED
target type: closed (frequent) item sets

MAXIMAL
public static final int MAXIMAL
target type: maximal (frequent) item sets

GENERATORS
public static final int GENERATORS
target type: generator (frequent) item sets

GENERAS
public static final int GENERAS
target type: generator (frequent) item sets

RULES
public static final int RULES
target type: association rules

IGNORE
public static final int IGNORE
item may not appear anywhere in a rule

NEITHER
public static final int NEITHER
item may not appear anywhere in a rule

INPUT
public static final int INPUT
item may appear only in a rule body/antecedent

BODY
public static final int BODY
item may appear only in a rule body/antecedent

ANTE
public static final int ANTE
item may appear only in a rule body/antecedent

ANTECEDENT
public static final int ANTECEDENT
item may appear only in a rule body/antecedent

OUTPUT
public static final int OUTPUT
item may appear only in a rule head/consequent

HEAD
public static final int HEAD
item may appear only in a rule head/consequent

CONS
public static final int CONS
item may appear only in a rule head/consequent

CONSEQUENT
public static final int CONSEQUENT
item may appear only in a rule head/consequent

BOTH
public static final int BOTH
item may appear anywhere in a rule

INOUT
public static final int INOUT
item may appear anywhere in a rule

CANDA
public static final int CANDA
item may appear anywhere in a rule

NONE
public static final int NONE
evaluation measure/aggregation mode: none

SUPPORT
public static final int SUPPORT
evaluation measure: rule support (larger is better)

SUPP
public static final int SUPP
evaluation measure: rule support (larger is better)

CONFIDENCE
public static final int CONFIDENCE
evaluation measure: rule confidence (larger is better)

CONF
public static final int CONF
evaluation measure: rule confidence (larger is better)

CONFDIFF
public static final int CONFDIFF
evaluation measure: absolute confidence difference to prior (larger is better)

LIFT
public static final int LIFT
evaluation measure: lift value (confidence divided by prior) (larger is better)

LIFTDIFF
public static final int LIFTDIFF
evaluation measure: difference of lift value to 1 (larger is better)

LIFTQUOT
public static final int LIFTQUOT
evaluation measure: difference of lift quotient to 1 (larger is better)

CONVICTION
public static final int CONVICTION
evaluation measure: conviction (larger is better)

CVCT
public static final int CVCT
evaluation measure: conviction (larger is better)

CVCTDIFF
public static final int CVCTDIFF
evaluation measure: difference of conviction to 1 (larger is better)

CVCTQUOT
public static final int CVCTQUOT
evaluation measure: difference of conviction quotient to 1 (larger is better)

CPROB
public static final int CPROB
evaluation measure: conditional probability ratio (larger is better)

CONDPROB
public static final int CONDPROB
evaluation measure: conditional probability ratio (larger is better)

IMPORTANCE
public static final int IMPORTANCE
evaluation measure: importance (larger is better)

IMPORT
public static final int IMPORT
evaluation measure: importance (larger is better)

CERTAINTY
public static final int CERTAINTY
evaluation measure: certainty factor (larger is better)

CERT
public static final int CERT
evaluation measure: certainty factor (larger is better)

CHI2
public static final int CHI2
evaluation measure: normalized chi^2 measure (larger is better)

CHI2PVAL
public static final int CHI2PVAL
evaluation measure: p-value from chi^2 measure (smaller is better)

YATES
public static final int YATES
evaluation measure: normalized chi^2 measure (Yates corrected) (larger is better)

YATESPVAL
public static final int YATESPVAL
evaluation measure: p-value from chi^2 measure (Yates corrected) (smaller is better)

INFO
public static final int INFO
evaluation measure: information difference to prior (larger is better)

INFOPVAL
public static final int INFOPVAL
evaluation measure: p-value from information difference (smaller is better)

FETPROB
public static final int FETPROB
evaluation measure: Fisher's exact test (table probability) (smaller is better)

FETCHI2
public static final int FETCHI2
evaluation measure: Fisher's exact test (chi^2 measure) (smaller is better)

FETINFO
public static final int FETINFO
evaluation measure: Fisher's exact test (information gain) (smaller is better)

FETSUPP
public static final int FETSUPP
evaluation measure: Fisher's exact test (support) (smaller is better)

LDRATIO
public static final int LDRATIO
evaluation measure: binary logarithm of support quotient (larger is better)

SIZESIM
public static final int SIZESIM
evaluation measure: item set size times cover similarity (larger is better) (only for JIM algorithm)

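To make a few of the rule evaluation measures listed here concrete, the following sketch computes rule confidence, the absolute confidence difference to the prior, and the lift value from raw transaction counts. The class RuleMeasures and its methods are hypothetical and not part of JNIFIM; the formulas are the common textbook definitions and are only assumed to match what the C library computes.

```java
// Illustrative sketch only: RuleMeasures is a hypothetical helper, not part of fim.JNIFIM.
final class RuleMeasures {

    /** Rule confidence: transactions containing body and head, divided by those containing the body. */
    static double confidence(int bodyAndHead, int body) {
        return (double) bodyAndHead / body;
    }

    /** Prior confidence of the head item: its support divided by the number of transactions. */
    static double prior(int head, int total) {
        return (double) head / total;
    }

    /** Lift value: confidence divided by prior (compare the LIFT description). */
    static double lift(int bodyAndHead, int body, int head, int total) {
        return confidence(bodyAndHead, body) / prior(head, total);
    }

    /** Absolute confidence difference to prior (compare the CONFDIFF description). */
    static double confDiff(int bodyAndHead, int body, int head, int total) {
        return Math.abs(confidence(bodyAndHead, body) - prior(head, total));
    }
}
```

For a rule whose body occurs in 40 of 100 transactions, whose head occurs in 50, and where both occur together in 30, the sketch yields a confidence of 0.75 against a prior of 0.5.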
MIN
public static final int MIN
aggregation mode: minimum of individual measure values

MINIMUM
public static final int MINIMUM
aggregation mode: minimum of individual measure values

MAX
public static final int MAX
aggregation mode: maximum of individual measure values

MAXIMUM
public static final int MAXIMUM
aggregation mode: maximum of individual measure values

AVG
public static final int AVG
aggregation mode: average of individual measure values

AVERAGE
public static final int AVERAGE
aggregation mode: average of individual measure values

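The aggregation modes reduce several individual measure values to one number. A minimal sketch of the three modes follows; the enum Mode is a hypothetical stand-in, since the numeric values of the JNIFIM constants are not stated here, and the actual aggregation is performed inside the C library.

```java
import java.util.Arrays;

// Illustrative sketch only: AggregationSketch and Mode are hypothetical, not part of fim.JNIFIM.
final class AggregationSketch {

    /** Stand-in for the MIN/MINIMUM, MAX/MAXIMUM and AVG/AVERAGE constants. */
    enum Mode { MIN, MAX, AVG }

    /** Aggregates the individual measure values; returns NaN for an empty array. */
    static double aggregate(double[] values, Mode mode) {
        switch (mode) {
            case MIN: return Arrays.stream(values).min().orElse(Double.NaN);
            case MAX: return Arrays.stream(values).max().orElse(Double.NaN);
            default:  return Arrays.stream(values).average().orElse(Double.NaN);
        }
    }
}
```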
AUTO
public static final int AUTO
algorithm variant: automatic choice (always applicable)

APRI_BASIC
public static final int APRI_BASIC
Apriori variant: basic algorithm

ECLAT_BASIC
public static final int ECLAT_BASIC
Eclat variant: transaction id lists intersection (basic)

ECLAT_LISTS
public static final int ECLAT_LISTS
Eclat variant: transaction id lists intersection (improved)

ECLAT_TIDS
public static final int ECLAT_TIDS
Eclat variant: transaction id lists intersection (improved)

ECLAT_BITS
public static final int ECLAT_BITS
Eclat variant: transaction id lists as bit vectors

ECLAT_TABLE
public static final int ECLAT_TABLE
Eclat variant: item occurrence table (standard)

ECLAT_SIMPLE
public static final int ECLAT_SIMPLE
Eclat variant: item occurrence table (simplified)

ECLAT_RANGES
public static final int ECLAT_RANGES
Eclat variant: transaction id range lists intersection

ECLAT_OCCDLV
public static final int ECLAT_OCCDLV
Eclat variant: occurrence deliver from transaction lists

ECLAT_LCM
public static final int ECLAT_LCM
Eclat variant: occurrence deliver from transaction lists

ECLAT_DIFFS
public static final int ECLAT_DIFFS
Eclat variant: transaction id difference sets (diffsets)

FPG_SIMPLE
public static final int FPG_SIMPLE
FP-growth variant: simple tree nodes (link and parent)

FPG_COMPLEX
public static final int FPG_COMPLEX
FP-growth variant: complex tree nodes (children and sibling)

FPG_SINGLE
public static final int FPG_SINGLE
FP-growth variant: top-down processing on a single prefix tree

FPG_TOPDOWN
public static final int FPG_TOPDOWN
FP-growth variant: top-down processing of the prefix trees

SAM_BASIC
public static final int SAM_BASIC
SaM variant: basic split and merge algorithm

SAM_BSEARCH
public static final int SAM_BSEARCH
SaM variant: split and merge with binary search

SAM_DOUBLE
public static final int SAM_DOUBLE
SaM variant: split and merge with double source buffering

SAM_TREE
public static final int SAM_TREE
SaM variant: split and merge with transaction prefix tree

RELIM_BASIC
public static final int RELIM_BASIC
RElim variant: basic recursive elimination algorithm

JIM_BASIC
public static final int JIM_BASIC
JIM variant: basic algorithm

CARP_TABLE
public static final int CARP_TABLE
Carpenter variant: item occurrence counter table

CARP_LISTS
public static final int CARP_LISTS
Carpenter variant: transaction identifier lists

CARP_TIDLIST
public static final int CARP_TIDLIST
Carpenter variant: transaction identifier lists

CARP_TIDLISTS
public static final int CARP_TIDLISTS
Carpenter variant: transaction identifier lists

ISTA_PREFIX
public static final int ISTA_PREFIX
IsTa variant: standard prefix tree

ISTA_PATRICIA
public static final int ISTA_PATRICIA
IsTa variant: patricia tree (compact prefix tree)

NOFIM16
public static final int NOFIM16
processing mode: do not use a 16-items machine

NOPERFECT
public static final int NOPERFECT
processing mode: do not use perfect extension pruning

NOSORT
public static final int NOSORT
processing mode: do not sort items w.r.t. conditional support

NOHUT
public static final int NOHUT
processing mode: do not use head union tail (hut) pruning (for maximal item sets)

HORZ
public static final int HORZ
processing mode: check extensions for closed/maximal item sets with a horizontal scheme (default: use a repository)

VERT
public static final int VERT
processing mode: check extensions for closed/maximal item sets with a vertical scheme (default: use a repository)

INVBXS
public static final int INVBXS
processing mode: invalidate evaluation below expected support

ORIGSUPP
public static final int ORIGSUPP
processing mode: use original support definition for rules (body & head instead of only body)

NOTREE
public static final int NOTREE
processing mode: do not organize transactions as a prefix tree (for Apriori algorithm)

POSTPRUNE
public static final int POSTPRUNE
processing mode: a-posteriori pruning of infrequent item sets (for Apriori algorithm)

REPOFILT
public static final int REPOFILT
processing mode: filter maximal item sets with repository (for Carpenter algorithm)

ADDALL
public static final int ADDALL
processing mode: add all (closed) item sets to repository (for Carpenter algorithm)

NOCOLLATE
public static final int NOCOLLATE
processing mode: do not collate equal transactions (for Carpenter algorithm)

NOPRUNE
public static final int NOPRUNE
processing mode: do not prune the prefix/patricia tree (for IsTa algorithm)

JIM_NONE
public static final int JIM_NONE
JIM: no cover similarity

JIM_RUSSEL_RAO
public static final int JIM_RUSSEL_RAO
JIM: Russel-Rao S_R = s/n

JIM_KULCYNSKI
public static final int JIM_KULCYNSKI
JIM: Kulcynski S_K = s/q

JIM_JACCARD
public static final int JIM_JACCARD
JIM: Jaccard/Tanimoto S_J = s/r

JIM_TANIMOTO
public static final int JIM_TANIMOTO
JIM: Jaccard/Tanimoto S_J = s/r

JIM_DICE
public static final int JIM_DICE
JIM: Dice S_D = 2s/(r+s)

JIM_SORENSEN
public static final int JIM_SORENSEN
JIM: Sorensen S_D = 2s/(r+s)

JIM_CZEKANOWSKI
public static final int JIM_CZEKANOWSKI
JIM: Czekanowski S_D = 2s/(r+s)

JIM_SOKAL_SNEATH_1
public static final int JIM_SOKAL_SNEATH_1
JIM: Sokal--Sneath 1 S_S = s/(r+q)

JIM_SOKAL_MICHENER
public static final int JIM_SOKAL_MICHENER
JIM: Sokal--Michener S_M = (s+z)/n

JIM_HAMMING
public static final int JIM_HAMMING
JIM: Hamming S_M = (s+z)/n

JIM_FAITH
public static final int JIM_FAITH
JIM: Faith S_F = (s+z/2)/n

JIM_ROGERS_TANIMOTO
public static final int JIM_ROGERS_TANIMOTO
JIM: Rogers--Tanimoto S_T = (s+z)/(n+q)

JIM_SOKAL_SNEATH_2
public static final int JIM_SOKAL_SNEATH_2
JIM: Sokal--Sneath 2 S_N = 2(s+z)/(n+s+z)

JIM_GOWER_LEGENDRE
public static final int JIM_GOWER_LEGENDRE
JIM: Gower--Legendre S_N = 2(s+z)/(n+s+z)

JIM_SOKAL_SNEATH_3
public static final int JIM_SOKAL_SNEATH_3
JIM: Sokal--Sneath 3 S_O = (s+z)/q

JIM_BARONI_BUSER
public static final int JIM_BARONI_BUSER
JIM: Baroni--Buser S_B = (x+s)/(x+r)

JIM_GENERIC
public static final int JIM_GENERIC
JIM: generic measure S = (c_0s +c_1z +c_2n +c_3x) / (c_4s +c_5z +c_6n +c_7x)

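The symbols in the cover similarity formulas are not defined on this page. The sketch below assumes the usual reading: n is the number of transactions, s the number of transactions that contain all items of the item set, and r the number that contain at least one of them. Under that assumption it evaluates the Russel-Rao, Jaccard/Tanimoto and Dice formulas for a small data set. CoverSimilaritySketch is a hypothetical helper, not part of JNIFIM, and it expects a non-empty item set.

```java
import java.util.Arrays;

// Illustrative sketch only: CoverSimilaritySketch is hypothetical, not part of fim.JNIFIM.
final class CoverSimilaritySketch {

    /** Returns {s, r, n} for a non-empty item set over the given transactions (assumed symbol meanings). */
    static int[] counts(int[][] tracts, int[] items) {
        int s = 0, r = 0;
        for (int[] t : tracts) {
            int hits = 0;                       // how many items of the set occur in t
            for (int item : items)
                if (Arrays.stream(t).anyMatch(x -> x == item)) hits++;
            if (hits == items.length) s++;      // t contains all items
            if (hits > 0) r++;                  // t contains at least one item
        }
        return new int[] { s, r, tracts.length };
    }

    /** Russel-Rao S_R = s/n. */
    static double russelRao(int s, int n) { return (double) s / n; }

    /** Jaccard/Tanimoto S_J = s/r. */
    static double jaccard(int s, int r) { return (double) s / r; }

    /** Dice S_D = 2s/(r+s). */
    static double dice(int s, int r) { return 2.0 * s / (r + s); }
}
```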
IDENT
public static final int IDENT
surrogate method: identity (keep original data)

RANDOM
public static final int RANDOM
surrogate method: random transaction generation

SWAP
public static final int SWAP
surrogate method: permutation by pair swaps

SHUFFLE
public static final int SHUFFLE
surrogate method: shuffle table-derived data (columns)

COLUMNS
public static final String COLUMNS
pattern spectrum report format: three columns size (integer), support (integer) and (average) occurrence frequency (double)

OBJECTS
public static final String OBJECTS
pattern spectrum report format: objects of type PatSpecElem

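The column format of a pattern spectrum can be pictured as three parallel arrays. The following sketch builds such arrays by counting how often each (size, support) pair occurs in a list of patterns. PatternSpectrumSketch is a hypothetical helper for illustration; the library computes the spectrum in C, and the sort order used here (by size, then support) is an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: PatternSpectrumSketch is hypothetical, not part of fim.JNIFIM.
final class PatternSpectrumSketch {

    /** Returns {int[] sizes, int[] supports, double[] frequencies}; inputs must be non-negative. */
    static Object[] spectrum(int[] sizes, int[] supports) {
        // Pack size and support into one long so that the map sorts by size, then support.
        Map<Long, Integer> counts = new TreeMap<>();
        for (int i = 0; i < sizes.length; i++) {
            long key = ((long) sizes[i] << 32) | (supports[i] & 0xffffffffL);
            counts.merge(key, 1, Integer::sum);
        }
        int[] outSizes = new int[counts.size()];
        int[] outSupps = new int[counts.size()];
        double[] outFreqs = new double[counts.size()];
        int k = 0;
        for (Map.Entry<Long, Integer> e : counts.entrySet()) {
            outSizes[k] = (int) (e.getKey() >>> 32);          // upper half: size
            outSupps[k] = (int) (e.getKey() & 0xffffffffL);   // lower half: support
            outFreqs[k] = e.getValue();                       // occurrences of this signature
            k++;
        }
        return new Object[] { outSizes, outSupps, outFreqs };
    }
}
```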
Constructor Details
JNIFIM
public JNIFIM()
Constructor.
Since:
2023.07.30 (Christian Borgelt)
Method Details
fim
public static Object[] fim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int[] border)
Java interface to frequent item set mining in C (very simplified interface).
Parameters:
tracts - array of transactions to analyze, each of which is an array of integers
wgts - weights of the transactions (same array index); may be null if the transactions do not carry weights, in which case each transaction receives a unit weight
target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL or GENERATORS)
supp - minimum support of an item set (positive: percentage, negative: absolute number)
zmin - minimum number of items per item set
zmax - maximum number of items per item set
report - values to report with an item set (multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
Returns:
if report != "#" and report != "=": an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
Since:
2014.09.26 (Christian Borgelt)

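The return layout described for fim can be walked as in the following sketch, written for a report string of "as". Because calling fim requires the native library, the example operates on a hand-built array with the same shape. The runtime element types (int[][] for the item sets, int[] for "a", double[] for "s") are assumptions inferred from the parameter description, and FimResultSketch is a hypothetical helper.

```java
import java.util.Arrays;

// Illustrative sketch only: FimResultSketch is hypothetical, not part of fim.JNIFIM.
final class FimResultSketch {

    /**
     * Formats one line per item set. Assumes report = "as", so that result[0] holds
     * the item sets, result[1] the absolute and result[2] the relative supports.
     * A real call might look like (needs the native library):
     *   Object[] result = fim.JNIFIM.fim(tracts, null, fim.JNIFIM.FREQUENT,
     *                                    10.0, 1, 3, "as", null);
     */
    static String[] describe(Object[] result) {
        int[][] sets = (int[][]) result[0];    // assumed runtime type
        int[] abs = (int[]) result[1];         // assumed runtime type for "a"
        double[] rel = (double[]) result[2];   // assumed runtime type for "s"
        String[] lines = new String[sets.length];
        for (int i = 0; i < sets.length; i++)
            lines[i] = Arrays.toString(sets[i]) + " a=" + abs[i] + " s=" + rel[i];
        return lines;
    }
}
```

Values at the same index in each array belong to the same item set, which is why a single loop index suffices.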
xfim
public static Object[] xfim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, int agg, double thresh, int[] border)
Java interface to frequent item set mining in C (less simplified interface).
Parameters:
tracts - array of transactions to analyze, each of which is an array of integers
wgts - weights of the transactions (same array index); may be null if the transactions do not carry weights, in which case each transaction receives a unit weight
target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL or GENERATORS)
supp - minimum support of an item set (positive: percentage, negative: absolute number)
zmin - minimum number of items per item set
zmax - maximum number of items per item set
report - values to report with an item set (multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"e": value of item set evaluation measure (double)
"E": value of item set evaluation measure as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
eval - measure for item set evaluation (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
agg - evaluation measure aggregation mode (NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
Returns:
if report != "#" and report != "=": an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
Since:
2014.10.01 (Christian Borgelt)

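Two of the numeric parameters above carry a sign or direction convention: supp is a percentage when positive and an absolute number when negative, and thresh is a lower or an upper bound depending on the measure. The sketch below spells out one plausible reading. ThresholdSketch is a hypothetical helper; rounding up to the next integer and treating the bound as inclusive are assumptions, since this page does not specify them.

```java
// Illustrative sketch only: ThresholdSketch is hypothetical, not part of fim.JNIFIM.
final class ThresholdSketch {

    /**
     * Converts supp to an absolute minimum support.
     * Positive values are read as a percentage of the total transaction weight,
     * negative values as an absolute number. Rounding up is an assumption.
     */
    static int minSupport(double supp, int totalWeight) {
        double abs = (supp >= 0) ? supp * totalWeight / 100.0 : -supp;
        return (int) Math.ceil(abs);
    }

    /**
     * Checks a measure value against thresh: a lower bound if larger is better,
     * an upper bound if smaller is better. Inclusive comparison is an assumption.
     */
    static boolean passes(double value, double thresh, boolean largerIsBetter) {
        return largerIsBetter ? value >= thresh : value <= thresh;
    }
}
```

With 50 unit-weight transactions, a supp of 10.0 and a supp of -5 would both correspond to a minimum of 5 transactions under this reading.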
arules
public static Object[] arules(int[][] tracts, int[] wgts, double supp, double conf, int zmin, int zmax, String report, int eval, double thresh, int mode, int[][] appear)
Java interface to association rule induction in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
supp
- minimum support of an association rule
(positive: percentage, negative: absolute number)
conf
- minimum confidence of an association rule
zmin
- minimum number of items per association rule
zmax
- maximum number of items per association rule
report
- values to report with an association rule
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"b": absolute body set support (number of transactions, integer)
"x": relative body set support as a fraction (double)
"X": relative body set support as a percentage (double)
"h": absolute head item support (number of transactions, integer)
"y": relative head item support as a fraction (double)
"Y": relative head item support as a percentage (double)
"c": rule confidence as a fraction (double)
"C": rule confidence as a percentage (double)
"l": lift value of a rule (confidence/prior) (double)
"L": lift value of a rule as a percentage (double)
"e": value of rule evaluation measure (double)
"E": value of rule evaluation measure as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
eval
- measure for association rule evaluation
(NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
thresh
- threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
mode
- operation mode indicators/flags
(NONE or ORIGSUPP)
appear
- map from items to item appearance indicators as two integer arrays of equal size, with the first holding the items, the second the corresponding item appearance indicators.
This parameter may be null, in which case items may appear anywhere in a rule.
The item appearance indicators must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item which has a negative identifier.
- Returns:
- if report != "#" and report != "=": an array with k+2
elements, the first of which is an integer array that
contains the head (consequent) items, while the second
contains the body (antecedent) item sets as integer
arrays; the following k array elements contain arrays
that correspond to the values selected with the characters
in report (each value in these arrays corresponds to the
association rule at the same array index in the first
and the second array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.09.26 (Christian Borgelt)
-
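The k+2 element layout returned for association rules (heads, bodies, then one value array per report character) might be read back as in the sketch below. It does not call the native library: RuleResultSketch, describe and mockResult are hypothetical names, the input is a hand-built stand-in for a result obtained with report = "ac", and the int[]/double[] element types are assumptions inferred from the "(integer)" and "(double)" annotations.

```java
import java.util.ArrayList;
import java.util.List;

final class RuleResultSketch {

    /** Formats every rule as "body -> head (supp=..., conf=...)". */
    static List<String> describe(Object[] result) {
        int[] heads = (int[]) result[0];       // head (consequent) item of each rule
        int[][] bodies = (int[][]) result[1];  // body (antecedent) item set of each rule
        int[] supp = (int[]) result[2];        // values for 'a' (assumed to be int[])
        double[] conf = (double[]) result[3];  // values for 'c' (assumed to be double[])
        List<String> out = new ArrayList<>();
        for (int i = 0; i < heads.length; i++) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < bodies[i].length; j++) {
                if (j > 0) sb.append(' ');
                sb.append(bodies[i][j]);
            }
            sb.append(" -> ").append(heads[i])
              .append(" (supp=").append(supp[i])
              .append(", conf=").append(conf[i]).append(')');
            out.add(sb.toString());
        }
        return out;
    }

    /** Hand-built stand-in for a result obtained with report = "ac". */
    static Object[] mockResult() {
        return new Object[] {
            new int[] {3, 1},
            new int[][] {{1, 2}, {3}},
            new int[] {4, 5},
            new double[] {0.8, 0.5}
        };
    }

    public static void main(String[] args) {
        describe(mockResult()).forEach(System.out::println);
    }
}
```

All arrays share the same index per rule, so a single loop over the head array is enough to reassemble each rule.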
apriori
public static Object[] apriori(int[][] tracts, int[] wgts, int target, double supp, double conf, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)
Java interface to Apriori algorithm in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
target
- type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL, GENERATORS or RULES)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
conf
- minimum confidence of an association rule
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- values to report with an item set
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"b": absolute body set support (number of transactions, integer)
"x": relative body set support as a fraction (double)
"X": relative body set support as a percentage (double)
"h": absolute head item support (number of transactions, integer)
"y": relative head item support as a fraction (double)
"Y": relative head item support as a percentage (double)
"c": rule confidence as a fraction (double)
"C": rule confidence as a percentage (double)
"l": lift value of a rule (confidence/prior) (double)
"L": lift value of a rule as a percentage (double)
"e": value of rule evaluation measure (double)
"E": value of rule evaluation measure as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
eval
- measure for item set evaluation
(NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
agg
- evaluation measure aggregation mode
(NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
thresh
- threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
prune
- minimum size for evaluation filtering
= 0: backward filtering (no subset check)
< 0: weak forward filtering (one subset must qualify)
> 0: strong forward filtering (all subsets must qualify)
algo
- algorithm variant to use
(AUTO or APRI_BASIC)
mode
- operation mode indicators/flags
(NONE, NOPERFECT, NOTREE, POSTPRUNE, INVBXS, ORIGSUPP)
border
- array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
appear
- map from items to item appearance indicators as two integer arrays of equal size, with the first holding the items, the second the corresponding item appearance indicators.
This parameter is used only if target = RULES. It may be null, in which case items may appear anywhere in a rule.
The item appearance indicators must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item which has a negative identifier.
- Returns:
- if report != "#" and report != "=":
if target = RULES: an array with k+2 elements, the first of which is an integer array that contains the head (consequent) items, while the second contains the body (antecedent) item sets as integer arrays; the following k array elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the association rule at the same array index in the first and the second array).
if target != RULES: an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.10.01 (Christian Borgelt)
-
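The appear parameter is described as two parallel integer arrays: items in the first, appearance indicators in the second. One way to build such a pair from a map is sketched below. AppearMapSketch, toAppear and example are hypothetical names, and the numeric values of BODY and HEAD are placeholders chosen for this sketch only; real code would pass the corresponding JNIFIM constants, whose values are not stated in this section.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class AppearMapSketch {

    // Placeholder codes for illustration only (not the real constant values);
    // real code would use JNIFIM.BODY and JNIFIM.HEAD instead.
    static final int BODY = 1;
    static final int HEAD = 2;

    /** Converts an item-to-indicator map into the two parallel arrays. */
    static int[][] toAppear(Map<Integer, Integer> indicators) {
        int[] items = new int[indicators.size()];
        int[] modes = new int[indicators.size()];
        int i = 0;
        for (Map.Entry<Integer, Integer> e : indicators.entrySet()) {
            items[i] = e.getKey();    // item identifier
            modes[i] = e.getValue();  // appearance indicator for that item
            i++;
        }
        return new int[][] {items, modes};
    }

    /** Example map: default indicator via a negative pseudo-item, plus one item. */
    static int[][] example() {
        Map<Integer, Integer> m = new LinkedHashMap<>();
        m.put(-1, BODY);  // pseudo-item with negative identifier sets the default
        m.put(7, HEAD);   // item 7 may appear only in a rule head
        return toAppear(m);
    }

    public static void main(String[] args) {
        int[][] appear = example();
        for (int i = 0; i < appear[0].length; i++) {
            System.out.println(appear[0][i] + " -> " + appear[1][i]);
        }
    }
}
```

A LinkedHashMap is used so that the pseudo-item stays in the position where it was inserted; whether the native side depends on that order is not stated here.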
eclat
public static Object[] eclat(int[][] tracts, int[] wgts, int target, double conf, double supp, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)
Java interface to Eclat algorithm in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
target
- type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL, GENERATORS or RULES)
conf
- minimum confidence of an association rule
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- values to report with an item set
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"b": absolute body set support (number of transactions, integer)
"x": relative body set support as a fraction (double)
"X": relative body set support as a percentage (double)
"h": absolute head item support (number of transactions, integer)
"y": relative head item support as a fraction (double)
"Y": relative head item support as a percentage (double)
"c": rule confidence as a fraction (double)
"C": rule confidence as a percentage (double)
"l": lift value of a rule (confidence/prior) (double)
"L": lift value of a rule as a percentage (double)
"e": value of rule evaluation measure (double)
"E": value of rule evaluation measure as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
eval
- measure for item set evaluation
(NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
agg
- evaluation measure aggregation mode
(NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
thresh
- threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
prune
- minimum size for evaluation filtering
= 0: backward filtering (no subset check)
< 0: weak forward filtering (one subset must qualify)
> 0: strong forward filtering (all subsets must qualify)
algo
- algorithm variant to use
(AUTO, ECLAT_BASIC, ECLAT_TIDS, ECLAT_BITS, ECLAT_TABLE, ECLAT_SIMPLE, ECLAT_RANGES, ECLAT_OCCDLV, ECLAT_DIFFS)
mode
- operation mode indicators/flags
(NONE, NOFIM16, NOPERFECT, NOSORT, NOHUT, HORZ, VERT, INVBXS, ORIGSUPP)
border
- array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
appear
- map from items to item appearance indicators as two integer arrays of equal size, with the first holding the items, the second the corresponding item appearance indicators.
This parameter is used only if target = RULES. It may be null, in which case items may appear anywhere in a rule.
The item appearance indicators must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item which has a negative identifier.
- Returns:
- if report != "#" and report != "=":
if target = RULES: an array with k+2 elements, the first of which is an integer array that contains the head (consequent) items, while the second contains the body (antecedent) item sets as integer arrays; the following k array elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the association rule at the same array index in the first and the second array).
if target != RULES: an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2015.02.27 (Christian Borgelt)
-
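The supp parameter uses its sign to switch between a percentage (positive) and an absolute number of transactions (negative). The sketch below illustrates that convention with two hypothetical helpers (SupportThresholdSketch, absolute, minimumCount). Rounding the percentage case up to the next whole transaction is an assumption of this sketch; the rounding applied by the C code is not specified in this section.

```java
final class SupportThresholdSketch {

    /** Encodes an absolute transaction count as a supp argument (negative value). */
    static double absolute(int count) {
        return -count;
    }

    /**
     * Interprets a supp argument against a total transaction weight.
     * Negative values are taken as absolute counts, positive values as
     * percentages; rounding up is an assumption made for this sketch.
     */
    static int minimumCount(double supp, int totalWeight) {
        if (supp < 0) {
            return (int) Math.ceil(-supp);
        }
        return (int) Math.ceil(supp * totalWeight / 100.0);
    }

    public static void main(String[] args) {
        System.out.println(minimumCount(10.0, 50));        // 10 percent of 50 transactions
        System.out.println(minimumCount(absolute(3), 50)); // absolute count of 3
    }
}
```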
fpgrowth
public static Object[] fpgrowth(int[][] tracts, int[] wgts, int target, double supp, double conf, int zmin, int zmax, String report, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border, int[][] appear)
Java interface to FP-growth algorithm in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
target
- type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL, GENERATORS or RULES)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
conf
- minimum confidence of an association rule
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- values to report with an item set
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"b": absolute body set support (number of transactions, integer)
"x": relative body set support as a fraction (double)
"X": relative body set support as a percentage (double)
"h": absolute head item support (number of transactions, integer)
"y": relative head item support as a fraction (double)
"Y": relative head item support as a percentage (double)
"c": rule confidence as a fraction (double)
"C": rule confidence as a percentage (double)
"l": lift value of a rule (confidence/prior) (double)
"L": lift value of a rule as a percentage (double)
"e": value of rule evaluation measure (double)
"E": value of rule evaluation measure as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
eval
- measure for item set evaluation
(NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
agg
- evaluation measure aggregation mode
(NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
thresh
- threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
prune
- minimum size for evaluation filtering
= 0: backward filtering (no subset check)
< 0: weak forward filtering (one subset must qualify)
> 0: strong forward filtering (all subsets must qualify)
algo
- algorithm variant to use
(AUTO, FPG_SIMPLE, FPG_COMPLEX, FPG_SINGLE, FPG_TOPDOWN)
mode
- operation mode indicators/flags
(NONE, NOFIM16, NOPERFECT, NOSORT, NOHUT, INVBXS, ORIGSUPP)
border
- array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
appear
- map from items to item appearance indicators as two integer arrays of equal size, with the first holding the items, the second the corresponding item appearance indicators.
This parameter is used only if target = RULES. It may be null, in which case items may appear anywhere in a rule.
The item appearance indicators must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item which has a negative identifier.
- Returns:
- if report != "#" and report != "=":
if target = RULES: an array with k+2 elements, the first of which is an integer array that contains the head (consequent) items, while the second contains the body (antecedent) item sets as integer arrays; the following k array elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the association rule at the same array index in the first and the second array).
if target != RULES: an array with k+1 elements, the first of which is an array that contains the item sets as integer arrays, while the following k elements contain arrays that correspond to the values selected with the characters in report (each value in these arrays corresponds to the item set at the same array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.10.01 (Christian Borgelt)
-
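With report = "#" the result is a pattern spectrum in three parallel arrays: sizes, supports and occurrence frequencies. The following sketch aggregates such a spectrum per item set size. It works on a hand-built stand-in, and PatternSpectrumSketch, totalForSize and mockSpectrum are hypothetical names; the int[], int[], double[] element types follow the wording of the return description.

```java
final class PatternSpectrumSketch {

    /** Sums the occurrence frequencies of all signatures with the given size. */
    static double totalForSize(Object[] spectrum, int size) {
        int[] sizes = (int[]) spectrum[0];        // item set sizes
        double[] freqs = (double[]) spectrum[2];  // frequency of each (size, support) pair
        double sum = 0.0;
        for (int i = 0; i < sizes.length; i++) {
            if (sizes[i] == size) {
                sum += freqs[i];
            }
        }
        return sum;
    }

    /** Hand-built stand-in for a result obtained with report = "#". */
    static Object[] mockSpectrum() {
        return new Object[] {
            new int[] {2, 2, 3},            // sizes
            new int[] {5, 6, 5},            // supports (same index as sizes)
            new double[] {3.0, 1.0, 2.0}    // occurrence frequencies
        };
    }

    public static void main(String[] args) {
        Object[] spectrum = mockSpectrum();
        int[] sizes = (int[]) spectrum[0];
        int[] supps = (int[]) spectrum[1];
        double[] freqs = (double[]) spectrum[2];
        for (int i = 0; i < sizes.length; i++) {
            // each row is one pattern signature with its frequency
            System.out.println(sizes[i] + "\t" + supps[i] + "\t" + freqs[i]);
        }
        System.out.println("size 2 total: " + totalForSize(spectrum, 2));
    }
}
```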
sam
public static Object[] sam(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
Java interface to SaM algorithm in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
target
- type of the item sets to find (SETS, ALL, FREQUENT, CLOSED or MAXIMAL)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- values to report with an item set
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"e": value of item set evaluation measure (double)
"E": value of item set evaluation measure as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
eval
- measure for item set evaluation
(NONE, LDRATIO)
thresh
- threshold for evaluation measure
algo
- algorithm variant to use
(AUTO, SAM_SIMPLE, SAM_BSEARCH, SAM_DOUBLE, SAM_TREE)
mode
- operation mode indicators/flags
(NONE, NOFIM16, NOPERFECT)
border
- array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
- Returns:
- if report != "#" and report != "=": an array with k+1
elements, the first of which is an array that contains
the item sets as integer arrays, while the following k
elements contain arrays that correspond to the values
selected with the characters in report (each value in
these arrays corresponds to the item set at the same
array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.10.01 (Christian Borgelt)
-
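The border parameter holds one support threshold per item set size, indexed by size. The sketch below mimics that filter on the Java side to make the indexing concrete. BorderFilterSketch and passes are hypothetical names, and two points are assumptions of this sketch rather than documented behavior: thresholds are treated as absolute supports, and sizes beyond the end of the array are left unfiltered.

```java
final class BorderFilterSketch {

    /**
     * Returns true if an item set of the given size and support would pass
     * the per-size threshold (assumed semantics, see the note above).
     */
    static boolean passes(int[] border, int size, int support) {
        if (border == null || size >= border.length) {
            return true;                 // no additional filtering for this size
        }
        return support >= border[size];  // the item set size is the array index
    }

    public static void main(String[] args) {
        // index:       0  1  2   3
        int[] border = {0, 0, 10, 5};    // pairs need support 10, triples need 5
        System.out.println(passes(border, 2, 9));
        System.out.println(passes(border, 3, 5));
        System.out.println(passes(border, 4, 1));
    }
}
```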
relim
public static Object[] relim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
Java interface to RElim algorithm in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
target
- type of the item sets to find (SETS, ALL, FREQUENT, CLOSED or MAXIMAL)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- values to report with an item set
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"e": value of item set evaluation measure (double)
"E": value of item set evaluation measure as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
eval
- measure for item set evaluation (NONE, LDRATIO)
thresh
- threshold for evaluation measure
algo
- algorithm variant to use
(AUTO, RELIM_BASIC)
mode
- operation mode indicators/flags
(NONE, NOFIM16, NOPERFECT)
border
- array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
- Returns:
- if report != "#" and report != "=": an array with k+1
elements, the first of which is an array that contains
the item sets as integer arrays, while the following k
elements contain arrays that correspond to the values
selected with the characters in report (each value in
these arrays corresponds to the item set at the same
array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.10.01 (Christian Borgelt)
-
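The tracts and wgts parameters are parallel arrays: wgts[i] is the weight of tracts[i]. One possible use is to collapse repeated transactions into a single entry with a count, as sketched below. WeightedTransactionsSketch, Weighted and collapse are hypothetical names, and treating a transaction as an unordered set (so that {1, 2} and {2, 1} are the same) is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class WeightedTransactionsSketch {

    /** Parallel arrays in the shape expected for the tracts and wgts parameters. */
    static final class Weighted {
        final int[][] tracts;
        final int[] wgts;
        Weighted(int[][] tracts, int[] wgts) {
            this.tracts = tracts;
            this.wgts = wgts;
        }
    }

    /** Merges identical transactions and counts how often each one occurred. */
    static Weighted collapse(int[][] raw) {
        Map<String, int[]> unique = new LinkedHashMap<>();
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int[] t : raw) {
            int[] sorted = t.clone();
            Arrays.sort(sorted);                   // item order is ignored
            String key = Arrays.toString(sorted);  // canonical key per item set
            unique.putIfAbsent(key, sorted);
            counts.merge(key, 1, Integer::sum);
        }
        int[][] tracts = unique.values().toArray(new int[0][]);
        int[] wgts = counts.values().stream().mapToInt(Integer::intValue).toArray();
        return new Weighted(tracts, wgts);
    }

    public static void main(String[] args) {
        Weighted w = collapse(new int[][] {{1, 2}, {2, 1}, {3}});
        for (int i = 0; i < w.tracts.length; i++) {
            System.out.println(Arrays.toString(w.tracts[i]) + " x " + w.wgts[i]);
        }
    }
}
```

Both maps receive their keys in the same order, so the two resulting arrays stay aligned by index.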
jim
public static Object[] jim(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int covsim, double[] simps, double sim, int algo, int mode, int[] border)
Java interface to JIM algorithm in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
target
- type of the item sets to find (SETS, ALL, FREQUENT, CLOSED or MAXIMAL)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- values to report with an item set
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"e": value of item set evaluation measure (double)
"E": value of item set evaluation measure as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
eval
- measure for item set evaluation
(NONE, LDRATIO)
thresh
- threshold for evaluation measure
covsim
- cover similarity measure (JIM_NONE, JIM_RUSSEL_RAO, JIM_KULCZYNSKI, JIM_JACCARD, JIM_TANIMOTO, JIM_DICE, JIM_SORENSEN, JIM_CZEKANOWKSI, JIM_SOKAL_SNEATH_1, JIM_SOKAL_MICHENER, JIM_HAMMING, JIM_FAITH, JIM_ROGERS_TANIMOTO, JIM_SOKAL_SNEATH_2, JIM_GOWER_LEGENDRE, JIM_SOKAL_SNEATH_3, JIM_BARONI_BUSER, JIM_GENERIC)
simps
- cover similarity measure parameters (if generic): S = (c_0 s + c_1 z + c_2 n + c_3 x) / (c_4 s + c_5 z + c_6 n + c_7 x)
sim
- threshold for cover similarity measure
algo
- algorithm variant to use
(AUTO, SAM_SIMPLE, SAM_BSEARCH, SAM_DOUBLE, SAM_TREE)
mode
- operation mode indicators/flags
(NONE, NOFIM16, NOPERFECT)
border
- array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
- Returns:
- if report != "#" and report != "=": an array with k+1
elements, the first of which is an array that contains
the item sets as integer arrays, while the following k
elements contain arrays that correspond to the values
selected with the characters in report (each value in
these arrays corresponds to the item set at the same
array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2018.03.21 (Christian Borgelt)
-
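The generic cover similarity given for simps is a ratio of two linear combinations with eight coefficients c_0 to c_7. The sketch below simply evaluates that formula. GenericSimilaritySketch and similarity are hypothetical names; the quantities s, z, n and x are not defined in this section, so the code treats them as four numbers supplied by the caller, and the coefficient vector used in main is an arbitrary illustration, not a claim about any named measure.

```java
final class GenericSimilaritySketch {

    /**
     * Evaluates S = (c0*s + c1*z + c2*n + c3*x) / (c4*s + c5*z + c6*n + c7*x).
     * A zero denominator yields NaN or an infinity, as usual for double division.
     */
    static double similarity(double[] c, double s, double z, double n, double x) {
        if (c.length != 8) {
            throw new IllegalArgumentException("expected 8 coefficients, got " + c.length);
        }
        double numerator = c[0] * s + c[1] * z + c[2] * n + c[3] * x;
        double denominator = c[4] * s + c[5] * z + c[6] * n + c[7] * x;
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Illustrative coefficients only: numerator picks s, denominator picks x.
        double[] simps = {1, 0, 0, 0, 0, 0, 0, 1};
        System.out.println(similarity(simps, 2, 0, 0, 4));
    }
}
```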
carpenter
public static Object[] carpenter(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
Java interface to Carpenter algorithm in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
target
- type of the item sets to find (CLOSED or MAXIMAL)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- values to report with an item set
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"e": value of item set evaluation measure (double)
"E": value of item set evaluation measure as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
eval
- measure for item set evaluation (NONE, LDRATIO)
thresh
- threshold for evaluation measure
algo
- algorithm variant to use
(AUTO, CARP_TABLE, CARP_TIDLIST)
mode
- operation mode indicators/flags
(NONE, NOPERFECT, REPOFILT, MAXONLY, NOCOLLATE)
border
- array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
- Returns:
- if report != "#" and report != "=": an array with k+1
elements, the first of which is an array that contains
the item sets as integer arrays, while the following k
elements contain arrays that correspond to the values
selected with the characters in report (each value in
these arrays corresponds to the item set at the same
array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.10.01 (Christian Borgelt)
-
ista
public static Object[] ista(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int eval, double thresh, int algo, int mode, int[] border)
Java interface to IsTa algorithm in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
target
- type of the item sets to find (CLOSED or MAXIMAL)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- values to report with an item set
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"e": value of item set evaluation measure (double)
"E": value of item set evaluation measure as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
eval
- measure for item set evaluation (NONE, LDRATIO)
thresh
- threshold for evaluation measure
algo
- algorithm variant to use
(AUTO, ISTA_PREFIX, ISTA_PATRICIA)
mode
- operation mode indicators/flags
(NONE, NOPRUNE, REPOFILT)
border
- array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
- Returns:
- if report != "#" and report != "=": an array with k+1
elements, the first of which is an array that contains
the item sets as integer arrays, while the following k
elements contain arrays that correspond to the values
selected with the characters in report (each value in
these arrays corresponds to the item set at the same
array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.10.01 (Christian Borgelt)
-
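The k+1 layout of the returned array (item sets first, then one value array per character in report) can be walked as in the following sketch. The class name ItemSetResultReader and its method are hypothetical helpers, not part of JNIFIM, and the sketch assumes that the first element arrives as int[][]; the value arrays are read through java.lang.reflect.Array so that the code does not depend on whether a column is an int[] or a double[]. The data in main is a hand-built stand-in, not output of the library.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ItemSetResultReader {
    private ItemSetResultReader() {}

    /** Formats each item set together with its reported values, one string per item set. */
    static List<String> describe(Object[] result, String report) {
        // Element 0 holds the item sets (assumed here to be an int[][]).
        int[][] sets = (int[][]) result[0];
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < sets.length; i++) {
            StringBuilder sb = new StringBuilder(Arrays.toString(sets[i]));
            // Elements 1..k hold one value array per character in report.
            for (int c = 0; c < report.length(); c++) {
                Object column = result[c + 1];
                // Array.get reads from int[] and double[] alike.
                sb.append(' ').append(report.charAt(c))
                  .append('=').append(Array.get(column, i));
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for a result obtained with report = "as" (made-up values).
        Object[] mock = {
            new int[][] {{1, 2}, {3}},
            new int[] {4, 7},
            new double[] {0.4, 0.7}
        };
        describe(mock, "as").forEach(System.out::println);
    }
}
```

This applies only when report is neither "#" nor "="; the pattern spectrum formats have a different layout.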
apriacc
public static Object[] apriacc(int[][] tracts, int[] wgts, double supp, int zmin, int zmax, String report, int stat, double siglvl, int prune, int mode, int[] border)
Java interface to accretion-style Apriori algorithm in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- values to report with an item set
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"p": p-value of item set test as a fraction (double)
"P": p-value of item set test as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
stat
- test statistic for item set evaluation (NONE, CHI2PVAL, YATESPVAL, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
siglvl
- significance level (maximum p-value)
prune
- minimum size for evaluation filtering
= 0: backward filtering (no subset check)
< 0: weak forward filtering (one subset must qualify)
> 0: strong forward filtering (all subsets must qualify)
mode
- operation mode indicators/flags
(NONE, INVBXS)
border
- array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
- Returns:
- if report != "#" and report != "=": an array with k+1
elements, the first of which is an array that contains
the item sets as integer arrays, while the following k
elements contain arrays that correspond to the values
selected with the characters in report (each value in
these arrays corresponds to the item set at the same
array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.10.01 (Christian Borgelt)
-
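Because the border array is indexed by item set size, it can be convenient to build it from a size-to-threshold mapping. The sketch below shows one way to do this; BorderArrays and build are hypothetical names, and the fill value for sizes without an explicit threshold is left to the caller, since this documentation does not state which value the C code treats as "no additional filtering" for a single size.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class BorderArrays {
    private BorderArrays() {}

    /**
     * Builds an array in which index = item set size and value = support threshold.
     * Sizes that do not occur in the map receive the caller-supplied fill value.
     */
    static int[] build(Map<Integer, Integer> thresholds, int fill) {
        int max = 0;
        for (int size : thresholds.keySet()) {
            if (size < 0) {
                throw new IllegalArgumentException("negative item set size: " + size);
            }
            max = Math.max(max, size);
        }
        int[] border = new int[max + 1];
        Arrays.fill(border, fill);
        thresholds.forEach((size, thr) -> border[size] = thr);
        return border;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> thresholds = new HashMap<>();
        thresholds.put(2, 10);   // item sets of size 2: threshold 10
        thresholds.put(3, 5);    // item sets of size 3: threshold 5
        System.out.println(Arrays.toString(build(thresholds, 0)));   // prints [0, 0, 10, 5]
    }
}
```

When no per-size filtering is wanted at all, passing null for border (as described above) avoids the question of a fill value entirely.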
accretion
public static Object[] accretion(int[][] tracts, int[] wgts, double supp, int zmin, int zmax, String report, int stat, double siglvl, int maxext, int mode, int[] border)
Java interface to Accretion algorithm in C.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- values to report with an item set
(multiple values possible except for "#" and "=", which have to form a one-character string)
"a": absolute item set support (number of transactions, integer)
"s": relative item set support (fraction of transactions, double)
"S": relative item set support (percentage of transactions, double)
"p": p-value of item set test as a fraction (double)
"P": p-value of item set test as a percentage (double)
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
stat
- test statistic for item set evaluation (NONE, CHI2PVAL, YATESPVAL, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
siglvl
- significance level (maximum p-value)
maxext
- maximum number of extension items
mode
- operation mode indicators/flags
(NONE, INVBXS)
border
- array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
- Returns:
- if report != "#" and report != "=": an array with k+1
elements, the first of which is an array that contains
the item sets as integer arrays, while the following k
elements contain arrays that correspond to the values
selected with the characters in report (each value in
these arrays corresponds to the item set at the same
array index in the first array).
if report = "#": an array with three elements; the first array contains the item set sizes as integers, the second array contains the support values as integers (that is, corresponding elements of the first and the second array form a pattern signature), and the third array contains the frequency (number of occurrences) of the size/support pair (at the same array index in the first two arrays) as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.10.01 (Christian Borgelt)
-
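The sign convention of supp (positive: percentage, negative: absolute number) is easy to mix up at the call site. A minimal sketch of helpers that make the intent explicit is given below. SupportArgs and its methods are hypothetical names, not part of JNIFIM, and impliedCount performs no rounding, because the rounding applied by the C code is not stated in this documentation.

```java
final class SupportArgs {
    private SupportArgs() {}

    /** Encodes a minimum support given as a percentage of the transactions. */
    static double percent(double pct) {
        if (!(pct >= 0.0 && pct <= 100.0)) {   // also rejects NaN
            throw new IllegalArgumentException("percentage out of range: " + pct);
        }
        return pct;
    }

    /** Encodes a minimum support given as an absolute number of transactions. */
    static double absolute(int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive: " + count);
        }
        return -count;   // negative values are read as absolute numbers
    }

    /** Absolute support implied by an encoded value for a given total transaction weight. */
    static double impliedCount(double supp, double totalWeight) {
        return supp < 0.0 ? -supp : supp * totalWeight / 100.0;
    }

    public static void main(String[] args) {
        System.out.println(absolute(5));                  // prints -5.0
        System.out.println(impliedCount(10.0, 200.0));    // prints 20.0
    }
}
```

A call would then pass, for example, SupportArgs.absolute(5) as the supp argument to request item sets that occur in at least five transactions.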
genpsp
public static Object[] genpsp(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int cnt, int surr, int seed, int cpus, int[] ctrl)
Pattern spectrum generation with surrogate data sets.
- Parameters:
tracts
- array of transactions to process, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
target
- type of the item sets to find (SETS, ALL or FREQUENT)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- format in which to report the pattern spectrum
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
cnt
- number of surrogate data sets to generate
surr
- surrogate data generation method (IDENT, RANDOM, SWAP or SHUFFLE)
seed
- seed value for random number generator
cpus
- number of cpus/threads to use
ctrl
- control array (progress indicator, stop flag)
- Returns:
- if report = "#": an array with three elements; the first
array contains the item set sizes as integers, the second
array contains the support values as integers (that is,
corresponding elements of the first and the second array
form a pattern signature), and the third array contains
the frequency (number of occurrences) of the size/support
pair (at the same array index in the first two arrays)
as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.09.26 (Christian Borgelt)
-
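The column format returned for report = "#" (three parallel arrays of sizes, supports and frequencies) can be turned into lookups as sketched below. PatternSpectrumColumns and its methods are hypothetical helpers, and the casts assume that the three elements arrive as int[], int[] and double[], which matches the description above but is not verified against the native code. The data in main is made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

final class PatternSpectrumColumns {
    private PatternSpectrumColumns() {}

    /** Sums the occurrence frequencies over all support values, per item set size. */
    static Map<Integer, Double> frequencyBySize(Object[] psp) {
        int[] sizes = (int[]) psp[0];
        double[] freqs = (double[]) psp[2];
        if (sizes.length != freqs.length) {
            throw new IllegalArgumentException("column lengths differ");
        }
        Map<Integer, Double> bySize = new TreeMap<>();
        for (int i = 0; i < sizes.length; i++) {
            bySize.merge(sizes[i], freqs[i], Double::sum);
        }
        return bySize;
    }

    /** Frequency of one pattern signature (size, support); 0.0 if it does not occur. */
    static double frequencyOf(Object[] psp, int size, int support) {
        int[] sizes = (int[]) psp[0];
        int[] supports = (int[]) psp[1];
        double[] freqs = (double[]) psp[2];
        for (int i = 0; i < sizes.length; i++) {
            if (sizes[i] == size && supports[i] == support) {
                return freqs[i];
            }
        }
        return 0.0;
    }

    public static void main(String[] args) {
        // Stand-in for a "#" result: signatures (2,5), (2,6), (3,5).
        Object[] psp = {
            new int[] {2, 2, 3},
            new int[] {5, 6, 5},
            new double[] {1.5, 0.5, 2.0}
        };
        System.out.println(frequencyBySize(psp));       // prints {2=2.0, 3=2.0}
        System.out.println(frequencyOf(psp, 2, 6));     // prints 0.5
    }
}
```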
estpsp
public static Object[] estpsp(int[][] tracts, int[] wgts, int target, double supp, int zmin, int zmax, String report, int equiv, double alpha, int smpls, int seed)
Estimate a pattern spectrum from data characteristics.
- Parameters:
tracts
- array of transactions to analyze, each of which is an array of integers
wgts
- weights of the transactions (same array index)
(may be null if the transactions do not carry weights; in this case each transaction receives a unit weight)
target
- type of the item sets to find (SETS, ALL or FREQUENT)
supp
- minimum support of an item set
(positive: percentage, negative: absolute number)
zmin
- minimum number of items per item set
zmax
- maximum number of items per item set
report
- format in which to report the pattern spectrum
"#": pattern spectrum in column format
"=": pattern spectrum with PatSpecElem
equiv
- equivalent number of surrogate data sets
alpha
- probability dispersion factor
smpls
- number of samples per item set size
seed
- seed value for random number generator
- Returns:
- if report = "#": an array with three elements; the first
array contains the item set sizes as integers, the second
array contains the support values as integers (that is,
corresponding elements of the first and the second array
form a pattern signature), and the third array contains
the frequency (number of occurrences) of the size/support
pair (at the same array index in the first two arrays)
as a double precision floating point value.
if report = "=": an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency.
- Since:
- 2014.09.26 (Christian Borgelt)
-
abort
public static void abort(int state)
Set the abort state (abort computations or clear abort state).
- Parameters:
state
- abort state to set (0: clear; != 0: signal abort)
- Since:
- 2015.03.05 (Christian Borgelt)
-
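Since abort both signals and clears the abort state, a caller may want to make sure that a stale abort request does not affect the next computation. The following sketch wraps a computation so that the state is cleared before it starts and again when it ends. AbortGuard and runCleared are hypothetical names; the abort function is passed in as an IntConsumer (for example JNIFIM::abort) so that the sketch runs without the native library. Whether the library clears the state on its own is not stated here, so the explicit clearing is a defensive assumption.

```java
import java.util.function.IntConsumer;
import java.util.function.Supplier;

final class AbortGuard {
    private AbortGuard() {}

    /** Runs task with the abort state cleared before it starts and after it ends. */
    static <T> T runCleared(IntConsumer abort, Supplier<T> task) {
        abort.accept(0);             // 0: clear any earlier abort request
        try {
            return task.get();
        } finally {
            abort.accept(0);         // leave a clean state for the next call
        }
    }

    public static void main(String[] args) {
        int[] state = {1};           // stand-in for the library's abort state
        String result = runCleared(s -> state[0] = s, () -> "finished");
        System.out.println(result + ", abort state = " + state[0]);
        // With the real class this might read (arguments elided):
        //   AbortGuard.runCleared(JNIFIM::abort, () -> JNIFIM.accretion(...));
        // while another thread calls JNIFIM.abort(1) to request a stop.
    }
}
```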