Pseudo-Triclustering

In addition to the normal triclustering methods, one alternative was suggested. It is called pseudo-triclustering because it builds tricluster-like structures out of biclusters in order to make some assumptions about the structure of the data.

A normal triadic formal context involves three dyadic contexts, which together form a ternary relation over three sets. In this case there are only two dyadic contexts, so some assumption must be made about the third. A rather strong assumption was made: if two formal concepts, one from each of the two contexts, both correspond to dense biclusters, then together they form a tricluster. Thus some connection is suggested between the two sets that are not directly related by either context.

Two measures were suggested for these pseudo-triclusters. The first one resembles normal density and shows the correspondence between the sets that formed the extent of the pseudo-tricluster.

The second one is the average of the densities of the basic biclusters.

Both measures take values from 0 to 1 inclusive.

Based on the second measure, some information can be obtained about the quality of the basic biclusters.
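The construction and the two measures can be illustrated with a minimal Python sketch. The text states only that the second measure is the average of the two bicluster densities. Everything else here is an assumption made for illustration: the `Bicluster` container, the function names, the choice to take the intersection of the two extents as the extent of the pseudo-tricluster, and the ratio of shared to combined extent as a stand-in for the first measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bicluster:
    """Hypothetical container for a bicluster with a precomputed density."""
    extent: frozenset   # objects of the shared set, e.g. users
    intent: frozenset   # attributes, e.g. interests or groups
    density: float      # share of filled cells in extent x intent, in [0, 1]


def pseudo_tricluster(first, second):
    """Combine two biclusters into (shared extent, first intent, second intent).

    ASSUMPTION: the shared extent is the intersection of the two extents.
    """
    return (first.extent & second.extent, first.intent, second.intent)


def extent_overlap(first, second):
    """ASSUMED form of the first measure: shared objects / all objects involved."""
    union = first.extent | second.extent
    if not union:
        return 0.0
    return len(first.extent & second.extent) / len(union)


def average_density(first, second):
    """Second measure: mean of the densities of the two basic biclusters."""
    return (first.density + second.density) / 2


# Toy data, invented for the example.
by_interest = Bicluster(frozenset({"u1", "u2", "u3"}), frozenset({"jazz", "chess"}), 0.8)
by_group = Bicluster(frozenset({"u2", "u3", "u4"}), frozenset({"g7"}), 1.0)

users, interests, groups = pseudo_tricluster(by_interest, by_group)
print(sorted(users), sorted(interests), sorted(groups))
print(extent_overlap(by_interest, by_group))   # 2 shared users out of 4 involved
print(average_density(by_interest, by_group))
```

Under these assumptions both functions return values between 0 and 1, which agrees with the range stated for the two measures.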

Results

So far, some computations have been made for the pseudo-triclustering method. Tests were run on a partial database of the Russian social networking site Vkontakte.ru. There were three sets: users (7161), interests (4994), and groups (108101); and two binary relations on them: the interests and the groups chosen by specific users. The main goal was to assign interests to groups as tags. The computational results are summarized in the following tables.

The first table shows the results of biclustering the two contexts:

Table 1. Results for biclustering

Minimal admissible density    User-interest biclustering        User-group biclustering
                              Time, ms    # of biclusters       Time, ms    # of biclusters
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
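The biclustering step reported in Table 1 can be sketched as follows. This is a minimal illustration, not the implementation used for the experiments. It assumes one candidate bicluster per related (object, attribute) pair, in the style of object-attribute biclustering, and keeps the candidates whose density reaches the minimal admissible value. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def build_context(pairs):
    """Index a binary relation both ways: object -> attributes and attribute -> objects."""
    attrs_of = defaultdict(set)
    objs_of = defaultdict(set)
    for obj, attr in pairs:
        attrs_of[obj].add(attr)
        objs_of[attr].add(obj)
    return attrs_of, objs_of


def density(extent, intent, attrs_of):
    """Share of the cells of extent x intent that belong to the relation."""
    filled = sum(len(attrs_of[obj] & intent) for obj in extent)
    return filled / (len(extent) * len(intent))


def dense_biclusters(pairs, min_density):
    """Yield distinct (extent, intent, density) triples meeting the threshold.

    ASSUMPTION: for each related pair (obj, attr) the candidate bicluster is
    (all objects having attr, all attributes of obj).
    """
    pairs = list(pairs)  # the relation is traversed twice
    attrs_of, objs_of = build_context(pairs)
    seen = set()
    for obj, attr in pairs:
        extent = frozenset(objs_of[attr])
        intent = frozenset(attrs_of[obj])
        if (extent, intent) in seen:
            continue  # the same bicluster can arise from several pairs
        seen.add((extent, intent))
        rho = density(extent, intent, attrs_of)
        if rho >= min_density:
            yield extent, intent, rho


# Toy user-interest relation, invented for the example.
user_interest = [
    ("u1", "jazz"), ("u1", "chess"),
    ("u2", "jazz"), ("u2", "chess"),
    ("u3", "jazz"),
]

for extent, intent, rho in dense_biclusters(user_interest, 0.9):
    print(sorted(extent), sorted(intent), rho)
```

Raising `min_density` can only shrink the set of biclusters that are kept, which is the dependence that the "# of biclusters" columns are meant to report. Whether the threshold is inclusive is an assumption of this sketch.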

 

After the biclustering, the sets of biclusters with density greater than 0.5 were chosen, and the pseudo-triclustering algorithm was applied to them with the following results:

Table 2. Results for pseudo-triclustering

Minimal admissible    Time, ms    # of pseudo-triclusters
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9

 



The pseudo-triclusters obtained still need interpretation, but the results are rather promising: a number of sensible biclusters and formal concepts were found, as well as coherent triclusters of adequate size. This suggests that the interpretation of the results will be of good quality.

Conclusion

In this paper the basics of formal concept analysis were considered, together with some methods for finding triclusters and triconcepts, with a focus on OAC-clustering. It has been shown that triclustering is an effective group of methods for analyzing various collections of data. The concept of pseudo-triclustering has been introduced and clarified. Although the final results are yet to be obtained, they are expected to be rather promising, especially for the pseudo-triclustering method.

As shown in the Results section, these methods are well suited to the analysis of real-world datasets, and many applications are possible, for instance the analysis of social networks, folksonomies, and reports. This explains the urgency of similar studies in data analysis. Existing algorithms still need considerable improvement, as their computational efficiency is far from ideal. Several parts of these algorithms can be modified, such as the density calculation or the possible halting criteria for OAC-triclustering.

It is also possible to generalize some of these methods to the case of fuzzy sets, which would further widen the range of applications of triclustering or improve its efficiency for the existing ones.


 



Date: 2015-12-24; view: 567

