Class Recoder
- All Implemented Interfaces:
Serializable
Processing node types in increasing order of their frequency can speed up the mining of frequent substructures considerably, so it is advisable to recode the node types to reflect the frequency order.
A recoder is implemented as a hash table (for encoding types) and an accompanying array (for decoding type codes).
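The hash table plus array layout described above can be sketched as follows. This is a minimal illustration, not the library's code: the class name `TypeTable`, the initial capacity, and the choice to return the existing code when a type is added twice are all assumptions. The return conventions of `encode` (-1 for an unknown type) and `decode` (the code itself for an unknown code) follow the method descriptions on this page.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a map for encoding types, an array for decoding codes. */
class TypeTable {
    private final Map<Integer, Integer> codes = new HashMap<>(); // type -> code
    private int[] types = new int[16];                           // code -> type
    private int size = 0;                                        // number of type/code pairs

    /** Add a type; a new type receives the size before the addition as its code. */
    int add(int type) {
        Integer known = codes.get(type);
        if (known != null) return known;   // assumption: re-adding returns the old code
        if (size == types.length) types = Arrays.copyOf(types, 2 * size);
        types[size] = type;
        codes.put(type, size);
        return size++;
    }

    /** Return the code of a type, or -1 if the type is not contained. */
    int encode(int type) {
        Integer code = codes.get(type);
        return (code != null) ? code : -1;
    }

    /** Return the type for a code, or the code itself if no type has that code. */
    int decode(int code) {
        return (code >= 0 && code < size) ? types[code] : code;
    }

    /** Return the number of stored type/code pairs. */
    int size() { return size; }
}
```

Because codes are handed out as consecutive integers starting at 0, decoding is a plain array lookup, while encoding needs the hash table since the original type values may be arbitrary integers.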
- Since:
- 2006.08.10
-
Constructor Summary
Constructor  Description
Recoder()    Create a recoder of default size.
-
Method Summary
Modifier and Type  Method                        Description
int                add(int type)                 Add a type to the recoder.
void               clear(int code)               Clear the frequency and support of a type.
void               commit()                      Commit a type code counting.
void               count(int code)               Count a type code.
int                decode(int code)              Decode a type code, that is, retrieve the original type value.
int                encode(int type)              Encode a type, that is, retrieve its code.
void               exclude(int code)             Mark a type as excluded.
int                getFreq(int type)             Get the frequency of a type (number of occurrences).
int                getSupp(int type)             Get the support of a type (number of containing graphs).
boolean            isExcluded(int code)          Check whether a type is excluded.
boolean            isMaximal(int code)           Check whether a code has maximal frequency.
void               maximize(int code)            Set frequency and support to a maximal value.
int                size()                        Get the size of the recoder.
void               sort()                        Sort types w.r.t. their frequency.
void               trim(boolean freq, int min)   Trim the recoder with a minimum support or frequency.
void               trim(int min)                 Trim the recoder with a minimum support.
void               trimFreq(int min)             Trim the recoder with a minimum frequency.
void               trimSupp(int min)             Trim the recoder with a minimum support.
-
Constructor Details
-
Recoder
public Recoder()
Create a recoder of default size.
- Since:
- 2006.08.10 (Christian Borgelt)
-
-
Method Details
-
size
public int size()
Get the size of the recoder. The size is the number of stored type/code pairs.
- Returns:
- the size of the recoder (number of types)
- Since:
- 2006.08.10 (Christian Borgelt)
-
add
public int add(int type)
Add a type to the recoder. The added type is assigned the next code, which is the size of the recoder before the new type was added. This ensures that type codes are consecutive integers starting at 0.
- Parameters:
type - the type to add to the recoder
- Returns:
- the code that is assigned to the type
- Since:
- 2006.08.10 (Christian Borgelt)
-
encode
public int encode(int type)
Encode a type, that is, retrieve its code.
- Parameters:
type - the type to encode
- Returns:
- the code of the given type or -1 if the type is not contained in the recoder
- Since:
- 2006.08.10 (Christian Borgelt)
-
decode
public int decode(int code)
Decode a type code, that is, retrieve the original type value.
- Parameters:
code - the type code to decode
- Returns:
- the original type value associated with the code or the value of the code itself if the recoder does not contain a corresponding type.
- Since:
- 2006.08.10 (Christian Borgelt)
-
count
public void count(int code)
Count a type code. Increment the internal counters for the frequency (and maybe also for the support) of this type code.
- Parameters:
code - the type code to count
- Since:
- 2006.08.10 (Christian Borgelt)
-
commit
public void commit()
Commit a type code counting. This function must be called after each graph for which the types of its nodes have been counted, so that the support of a type (number of graphs that contain a node of a given type) can be determined.
- Since:
- 2006.08.10 (Christian Borgelt)
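The interplay of count() and commit() can be illustrated with a small sketch. It assumes one plausible bookkeeping scheme (a per-graph "seen" flag that commit() turns into support); the class `FreqSuppCounter` and its fields are hypothetical and do not reflect how the library actually stores its counters.

```java
/** Hypothetical sketch of frequency versus support counting. */
class FreqSuppCounter {
    private final int[] freq;      // occurrences over all graphs
    private final int[] supp;      // number of graphs containing the code
    private final boolean[] seen;  // codes counted in the current graph

    FreqSuppCounter(int n) {
        freq = new int[n];
        supp = new int[n];
        seen = new boolean[n];
    }

    /** Count one occurrence of a type code in the current graph. */
    void count(int code) {
        freq[code]++;
        seen[code] = true;
    }

    /** Close the current graph: each code seen in it gains one unit of support. */
    void commit() {
        for (int c = 0; c < seen.length; c++) {
            if (seen[c]) {
                supp[c]++;
                seen[c] = false;
            }
        }
    }

    int getFreq(int code) { return freq[code]; }
    int getSupp(int code) { return supp[code]; }
}
```

In this sketch, a code counted three times across two graphs ends up with frequency 3 but support 2, which is the distinction the commit() call exists to capture.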
-
getFreq
public int getFreq(int type)
Get the frequency of a type (number of occurrences).
- Parameters:
type - the type of which to get the frequency
- Returns:
- the frequency of the type
- Since:
- 2010.01.21 (Christian Borgelt)
-
getSupp
public int getSupp(int type)
Get the support of a type (number of containing graphs).
- Parameters:
type - the type of which to get the support
- Returns:
- the support of the type
- Since:
- 2010.01.21 (Christian Borgelt)
-
trim
public void trim(boolean freq, int min)
Trim the recoder with a minimum support or frequency. All types whose support or frequency is less than the given minimum are marked as excluded. Note that such types are only marked, not actually removed from the recoder. Hence it is possible to reactivate them, for example by calling the function clear() for such a type.
- Parameters:
freq - whether to trim (also) by frequency
min - the minimum support or frequency of a type
- Since:
- 2006.08.10 (Christian Borgelt)
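Trimming by marking rather than removing might look like the sketch below. Several points are assumptions made only for illustration: exclusion is represented by a sentinel value of -1 in both counters (which would be consistent with the note that excluding a type loses its counts), and the `byFreq` flag simply selects which counter is compared. The class `TrimSketch` is hypothetical.

```java
/** Hypothetical sketch of trimming by marking types as excluded. */
class TrimSketch {
    static final int EXCLUDED = -1;   // assumed sentinel for excluded types
    final int[] freq;                 // frequency per type code
    final int[] supp;                 // support per type code

    TrimSketch(int[] freq, int[] supp) {
        this.freq = freq.clone();
        this.supp = supp.clone();
    }

    /** Mark a type as excluded; its counters are overwritten by the sentinel. */
    void exclude(int code) { freq[code] = supp[code] = EXCLUDED; }

    /** Check whether a type carries the exclusion mark. */
    boolean isExcluded(int code) { return freq[code] == EXCLUDED; }

    /** Reset the counters, which also removes the exclusion mark. */
    void clear(int code) { freq[code] = supp[code] = 0; }

    /** Exclude every type below the minimum (assumption: byFreq selects the counter). */
    void trim(boolean byFreq, int min) {
        for (int c = 0; c < freq.length; c++) {
            int value = byFreq ? freq[c] : supp[c];
            if (value < min) exclude(c);
        }
    }
}
```

Since the type stays in the table, its code remains valid after trimming, and a later clear() brings it back into play with zeroed counters.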
-
trim
public void trim(int min)
Trim the recoder with a minimum support.
- Parameters:
min - the minimum support of a type
- Since:
- 2007.06.25 (Christian Borgelt)
-
trimSupp
public void trimSupp(int min)
Trim the recoder with a minimum support.
- Parameters:
min - the minimum support of a type
- Since:
- 2006.08.10 (Christian Borgelt)
-
trimFreq
public void trimFreq(int min)
Trim the recoder with a minimum frequency.
- Parameters:
min - the minimum frequency of a type
- Since:
- 2007.06.22 (Christian Borgelt)
-
clear
public void clear(int code)
Clear the frequency and support of a type. Calling this function also removes a possible marking of the type as excluded.
- Parameters:
code - the code of the type for which to clear the counters
- Since:
- 2006.08.10 (Christian Borgelt)
-
exclude
public void exclude(int code)
Mark a type as excluded. Note that marking a type as excluded discards its frequency and support information. The function sort() moves excluded types to the front, that is, sorting assigns them the lowest codes.
- Parameters:
code - the code of the type to mark as excluded
- Since:
- 2006.08.10 (Christian Borgelt)
-
isExcluded
public boolean isExcluded(int code)
Check whether a type is excluded. Types can be excluded by explicitly calling the function exclude() or by trimming the recoder with a minimum support or frequency (by calling the function trim()).
- Parameters:
code - the code of the type to check
- Returns:
- whether the type is excluded
- Since:
- 2006.08.10 (Christian Borgelt)
-
maximize
public void maximize(int code)
Set frequency and support to a maximal value. This function indirectly offers the possibility to move a type to the end of the recoder. Since types are sorted w.r.t. their frequency, a type with maximal frequency will end up at the end of the recoder. This is needed if certain types are to be treated in a special way, independent of their frequency in the graph database.
- Parameters:
code - the code of the type for which to maximize the frequency
- Since:
- 2006.08.10 (Christian Borgelt)
-
isMaximal
public boolean isMaximal(int code)
Check whether a code has maximal frequency.
- Parameters:
code - the code of the type to check
- Returns:
- whether the code has maximal frequency
- Since:
- 2006.08.10 (Christian Borgelt)
-
sort
public void sort()
Sort types w.r.t. their frequency. The types are sorted in ascending order of their frequency, so that the least frequent type receives the code 0, the next least frequent the code 1, etc. Excluded types precede all non-excluded types; maximized types follow all other types.
- Since:
- 2006.08.10 (Christian Borgelt)
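The resulting code order can be sketched with a plain index sort. The encoding used here is an assumption chosen to make the ordering fall out naturally: excluded types carry a frequency of -1 and maximized types carry Integer.MAX_VALUE, so an ascending sort places them first and last. The class `SortSketch` and the method `sortedTypes` are hypothetical and not part of the documented API.

```java
import java.util.Arrays;

/** Hypothetical sketch of frequency-based recoding. */
class SortSketch {
    /**
     * Return the types in their new code order: index i of the result
     * holds the type that receives code i after sorting.
     * Assumptions: -1 marks an excluded type, Integer.MAX_VALUE a maximized one.
     */
    static int[] sortedTypes(int[] types, int[] freq) {
        Integer[] order = new Integer[types.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // Stable sort of the old codes by ascending frequency.
        Arrays.sort(order, (a, b) -> Integer.compare(freq[a], freq[b]));
        int[] result = new int[types.length];
        for (int code = 0; code < order.length; code++) {
            result[code] = types[order[code]];
        }
        return result;
    }
}
```

For example, with types {10, 20, 30, 40} and frequencies {7, -1, Integer.MAX_VALUE, 2}, the excluded type 20 receives code 0, followed by 40 and 10, and the maximized type 30 receives the highest code.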
-