Digital Humanities at Scale: Leveraging Big Data to Rethink Canon Formation in English Literature
Abstract
This study proposes a data-driven approach to the formation of the English literary canon that combines large-scale textual analysis, intertextual citation networks, and dynamic semantic-temporal models. The framework extracts word frequency, vocabulary diversity, and grammatical trends from a large collection of English literary works. These features are encoded as weighted vectors and aggregated through similarity graphs to identify groupings in the literary metadata. Canonical relevance scores draw on both citations and clusters to indicate how the canon was formed historically and conceptually. Intertextual citation networks refine the measurement of canonical relevance by propagating influence scores across weighted citation graphs, so that direct and indirect literary influence are both captured. A work's canonical status is updated dynamically from semantic similarity matrices and temporal decay functions, and the resulting canonical scores trace how its perceived value changes over time. The proposed method outperforms LDA topic modeling, TF-IDF-based canon scoring, and citation-count ranking in accuracy, precision, recall, F1-score, scalability, and interpretability. It also performs well in processing speed, data efficiency, robustness, flexibility, noise tolerance, and usability. By combining rigorous analysis with a practical implementation, the methodology gives digital humanities scholars a flexible instrument for researching and reinterpreting literary history. Its adaptable design accommodates the diversity of the canon, raises the standard for computational canon research, and encourages more nuanced, open, and complete literary study.
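The abstract does not specify how influence is propagated or how temporal decay is applied, so the following is only a minimal sketch under stated assumptions: influence spreads with a PageRank-style iteration, each citation edge is weighted by an exponential half-life decay of its age, and the names `temporal_decay`, `propagate_influence`, `canonical_scores`, the `damping` and `half_life` parameters, and the placeholder work identifiers are all hypothetical rather than taken from the study.

```python
def temporal_decay(age_years, half_life=50.0):
    """Assumed decay form: a citation loses half its weight every half_life years."""
    return 0.5 ** (age_years / half_life)


def propagate_influence(graph, damping=0.85, iterations=50):
    """PageRank-style propagation over a weighted citation graph.

    graph maps each citing work to {cited work: edge weight}.
    Indirect influence arises because score flows along chains of citations.
    """
    works = set(graph)
    for targets in graph.values():
        works.update(targets)
    works = sorted(works)
    n = len(works)
    score = {w: 1.0 / n for w in works}

    for _ in range(iterations):
        new = {w: (1.0 - damping) / n for w in works}
        for src in works:
            targets = graph.get(src, {})
            total = sum(targets.values())
            if total == 0:
                # A work that cites nothing spreads its score uniformly.
                for w in works:
                    new[w] += damping * score[src] / n
            else:
                for dst, weight in targets.items():
                    new[dst] += damping * score[src] * weight / total
        score = new
    return score


def canonical_scores(edges, current_year, half_life=50.0):
    """edges is a list of (citing, cited, citation_year) tuples."""
    graph = {}
    for citing, cited, year in edges:
        weight = temporal_decay(current_year - year, half_life)
        targets = graph.setdefault(citing, {})
        targets[cited] = targets.get(cited, 0.0) + weight
    return propagate_influence(graph)


# Placeholder identifiers, not real works or real citation data.
edges = [
    ("work_b", "work_a", 1900),
    ("work_c", "work_a", 1950),
    ("work_c", "work_b", 1950),
    ("work_d", "work_c", 2000),
]
scores = canonical_scores(edges, current_year=2000)
top_work = max(scores, key=scores.get)
```

In this toy graph `work_a` receives score directly from `work_b` and `work_c` and indirectly from `work_d` through `work_c`, which is the kind of direct-plus-indirect linkage the abstract describes. Recomputing with a different `current_year` changes the edge weights, which is one simple way a score could evolve over time.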