This book addresses the problem of text classification, motivated by the need to classify medical text reports accurately. Such documents contain important information about patients, disease progression and management, but their unstructured nature makes them difficult to analyse and query with conventional techniques. We show how these reports can be classified automatically with a high degree of accuracy. A novel method is developed that uses clustering as a pre-processing step to improve the final classification accuracy. The work investigates different methods for document representation, clustering and classification, and makes use of Natural Language Processing tools. A new approach that requires minimal labelling effort is found to be an effective classification tool for this task. Results show that the approach produces good classification performance on a real-world medical problem. Importantly, adding clustering features further improves the accuracy of the final classifier. Results are cross-checked using different medical classification tasks.