Exploring the Mysteries of Franco's New Year Messages
Welcome, esteemed colleagues and inquisitive minds, to an enlightening journey into the realm of political discourse and authorship attribution! I am Dr. Alma Valencia Uy, and it is with great pleasure that I present my groundbreaking research, which addresses the compelling question: Who truly authored Francisco Franco's New Year messages?
My scholarly thesis, titled "In Franco's Words: A Stylometric Analysis for the Authorship of New Year Speeches - Theory and Practice," deftly combines the precision of digital humanities and the analytical prowess of computational stylistics to unveil the concealed contributors behind these historic communications. This work not only elucidates the literary style and linguistic patterns intrinsic to Franco but also uncovers the collaborative endeavors that shaped these momentous political documents.
Join me as we delve into the intricate world of stylometry and digital humanities, showcasing the profound value of these methodologies in literary research. My work sets a distinguished precedent for future inquiries into political discourse and authorship attribution, emphasizing the multifaceted nature of historical texts.
Prepare to be enthralled by the meticulous dance of words and the narratives they reveal. Welcome to the fascinating world of Franco's New Year messages, where every sentence harbors a clue, and every analysis draws us nearer to the truth.
In my doctoral thesis, "In Franco's Words: Authorship Attribution of the New Year's Speeches," I employ sophisticated stylometric methodologies within the theoretical and practical framework of Digital Humanities to uncover the concealed authorship patterns behind Francisco Franco's annual addresses. Through meticulous computational analysis and linguistic examination, my research illuminates a historical narrative significantly more intricate, layered, and nuanced than conventional historical accounts have previously acknowledged or explored.
Research Objectives
The primary objective of this research is to definitively identify the authorship patterns within Francisco Franco's New Year messages through advanced stylometric techniques. By applying computational linguistics and statistical analysis, I aim to determine whether these speeches were solely authored by Franco or if they were collaborative works with significant input from other writers.
This investigation further seeks to establish a methodological framework for authorship attribution that can be applied to other historical political texts. Through the integration of digital humanities approaches with traditional historical analysis, this research bridges the gap between quantitative and qualitative methodologies in historical studies.
Additionally, this study endeavors to contribute to our understanding of political discourse during Franco's regime by uncovering the potential multiple voices behind these significant annual communications. By revealing the collaborative nature of political speechwriting, this research challenges the conventional historical narrative of singular authorial voice in dictatorial regimes.
Research Objectives
Tracing Verbal Profiles
Determining if there is a unique authorship or multiple contributors by analyzing the verbal profiles present in Franco's New Year messages.
Validating Stylometric Techniques
Testing the effectiveness of stylometric methods to identify unique writing patterns across different texts attributed to Franco.
Understanding Ideological Evolution
Analyzing the evolution of Franco's ideological-political stance through linguistic patterns in his speeches over time.
Methodology Overview
Establishing Research Questions
Formulating clear questions about authorship and stylistic consistency in Franco's messages.
Compiling and Preprocessing Corpus
Gathering all New Year messages and comparative texts, preparing them for analysis.
Quantitative Analysis
Applying statistical techniques including clustering and bootstrap methods.
Qualitative Analysis
Examining linguistic patterns and stylistic features to identify authorial fingerprints.
Historical Background of Authorship Attribution
Ancient Origins
Authorship attribution traces back to the Library of Alexandria, where scholars like Zenodotus and Aristarchus authenticated Homer's works.
Biblical Studies
Early attribution studies examined the authorship of biblical texts, questioning divine authorship and establishing canonical works.
Medieval and Renaissance
Lorenzo Valla pioneered new techniques in 1440 with his work on the Donation of Constantine, using language changes as evidence.
Modern Era
21st century attribution combines traditional methods with computational stylistics and quantitative style analysis.
The Library of Alexandria
Significant Bibliographic Collection
Founded by Ptolemy I, the Library of Alexandria became the most important collection of texts in ancient Greece.
First Research Groups
Distinguished scholars and librarians formed the earliest research groups dedicated to literary studies, including authorship attribution.
Homeric Authentication
One of the earliest authorship studies was conducted to authenticate the writings of Homer, particularly the Iliad and Odyssey.
Biblical Authorship Studies
Pentateuch Authorship
Research revealed that the Pentateuch was not entirely written by Moses as traditionally believed, challenging conventional attribution.
Jewish Old Testament
The books of the Jewish Old Testament were compiled after the destruction of the temple that housed them in 70 BCE.
New Testament Canon
The First Council of Carthage in 393 AD established the first canon of 27 books, examining their authenticity and apostolic origins.
Authorship Controversies
While the truthfulness of biblical texts was not questioned, controversy arose regarding which books should be accepted due to doubts about their apostolic origins.
Medieval and Renaissance Contributions
Evolution of Latin
Systematic changes in Latin demonstrated an evolution in language use
Lorenzo Valla's Techniques
New methods for verifying authorship introduced in 1440
The Donation of Constantine
Valla proved the document apocryphal through linguistic analysis
Before the 15th century, authorship attribution studies relied primarily on technical and material evidence from disputed documents. Lorenzo Valla's groundbreaking work "De Falsa et Ementita Constantini Donatione" in 1440 introduced new techniques for verifying authorship, using language changes as decisive evidence. This arose from a territorial dispute between King Alfonso V of Aragon and Pope Eugene IV. Valla's analysis showed that the document used terminology inconsistent with Constantine's era, proving it was apocryphal and leading the Holy See to remove it from circulation.
Modern Authorship Attribution
Traditional Methods
Traditional authorship attribution deduces the characteristics of an author by analyzing the features of their written documents. This approach relies on empirical investigation, discerning between internal and external evidence in a written text.
Empirical investigation
Analysis of internal evidence
Examination of external evidence
Non-Traditional Methods
Non-traditional authorship attribution, also known as stylometry or computational stylistics, uses modern techniques to automatically analyze documents through quantitative style analysis.
Statistical analysis
Computational linguistics
Machine learning algorithms
Biometric approaches
In the 21st century, authorship attribution has advanced significantly, incorporating both traditional and non-traditional methods to identify the true authors of texts with greater accuracy and reliability.
Stylometric Analysis Fundamentals
Authorial Fingerprint
Identifying unique writing patterns that distinguish one author from another
Statistical Methods
Using mathematical approaches to quantify stylistic features
Relative Vocabulary
Analyzing word frequency patterns as a measure of authorial style
Validation
Testing results against known samples to verify accuracy
Stylometric analysis aims to infer linguistic idiosyncrasies of an author mathematically from a statistical sum of various features from a group of texts. This method distinguishes authors significantly and consistently within a statistical group. The primary proposal in stylometry is using "relative vocabulary overlap" as a measure of frequency. Studies have shown that linguistic patterns in an author's writing style reaffirm the existence of an authorial fingerprint, making it possible to trace unique writing styles.
Challenges in Stylometry
Joseph Rudman's Critique (2000)
Rudman challenged the idea of a unique writing pattern, questioning the validity of attributing authorship based on distinctive stylistic features. He argued that the empirical verification of these features remains insufficient.
Barbara Johnstone's Counterpoint
Johnstone advocated for a focus on linguistic individuality, emphasizing the importance of self-expression and individual linguistic choices. She cited Edward Sapir's assertion that "the linguistic expression of the individual is a crucial factor in their identity."
Traditional vs. Modern Perspectives
While traditional models focus on group-based linguistic variations, modern perspectives advocate for recognizing individual linguistic choices and self-expression. This shift highlights the complexity of language and the need for a more nuanced approach to authorship attribution.
Individual Linguistic Expression
Morton's Brain Storage Theory
Andrew Q. Morton asserts that the brain stores unique experiences that influence writing style, stating, "The brain is the organ that stores human experiences, and through it, the author unconsciously uses their vocabulary composed of their collections or memories."
Foster's Memory Influence
D. W. Foster emphasizes the role of long-term memory in shaping an author's unique idiolect, suggesting that our linguistic choices are deeply influenced by our accumulated experiences.
Johnstone's Dynamic View
Barbara Johnstone proposes that "language changes and can manifest statistically by including individual, psychological, and rhetorical factors," highlighting the interaction between individual and social influences in shaping linguistic variation.
Historical Evolution of Stylometry
1
1851
Augustus De Morgan proposes analyzing word lengths to compare authorship.
2
1887
T.C. Mendenhall's study on Shakespeare popularizes stylometry by suggesting that word length could distinguish authors.
3
1897
Wincenty Lutosławski provides evidence in a dispute over the authorship of Plato's Dialogues.
4
20th Century
Development of methods like Burrows' Delta, Hoover's Zeta and Iota, and Eder's Bootstrap.
5
21st Century
Integration of machine learning algorithms, neural networks, and genetic algorithms.
Key Stylometric Techniques
Burrows' Delta
This method measures the distance between texts based on word frequencies, creating a mathematical representation of stylistic similarity.
Hoover's Zeta and Iota
Hoover's methods focus on rare words (Zeta) and common words (Iota), providing complementary approaches to stylistic analysis.
Eder's Bootstrap
This method generates random permutations of data to improve reliability and reduce the impact of outliers in stylometric analysis.
Analysis of Principal Components
A statistical technique that reduces the dimensionality of data while retaining most of the variation, allowing for visualization of stylistic differences.
Cluster Analysis
Groups texts based on stylistic similarities, creating dendrograms that visually represent relationships between texts.
Notable Applications of Stylometry
Jane Austen's Works
Burrows applied Delta to Jane Austen's works and successfully attributed Juvenal's translations.
Federalist Papers
Frederick Mosteller and David Wallace resolved the Federalist Papers' authorship controversy using stylometric techniques.
Cassandra Letters
David Holmes identified the author of the Cassandra Letters through detailed stylistic analysis.
Modern Software
Programs like JGAAP and 'stylo' in R have made stylometry accessible to a broader audience.
Criticisms of Stylometry
3
Major Critiques
Primary challenges to stylometric methodology
1851
Years of Debate
Ongoing discussions since De Morgan's early work
100+
Variables
Potential linguistic features that can be analyzed
Joseph Rudman critiqued stylometry for lacking theoretical foundations and being overly dependent on arbitrary variable selection. The "cherry picking" problem, where researchers selectively choose data to support their hypotheses, remains a significant concern in the field. Critics argue that the selection of which linguistic features to analyze can dramatically affect results, potentially leading to confirmation bias.
Modern Advancements in Stylometry
Neural Networks
Advanced pattern recognition for stylistic analysis
Machine Learning Algorithms
Naïve Bayes classifiers and support vector machines
Genetic Algorithms
Generation and refinement of syntactic patterns
Accessible Software Tools
JGAAP and 'stylo' in R programming language
Understanding Idiolect
Personal Linguistic Fingerprint
Idiolect refers to an individual's unique use of language, shaped by personal experiences, cognitive development, and social influences.
Cognitive Development
According to Herman (cited in Weinreich, Labov, & Herzog, 1968), "there are as many languages as there are individuals in the world," underscoring the notion that each person crafts their own idiolect.
3
3
Neural Uniqueness
Even identical twins, despite sharing genetic makeup, exhibit distinct linguistic patterns due to variations in neural development (Gage & Muotri, 2012).
Forensic Applications
In forensic linguistics, idiolect is instrumental in identifying distinct stylistic features that can attribute a text to a specific author.
Idiolect in Authorship Attribution
1
Conscious and Unconscious Choices
McMenamin (2010) posits that linguistic style manifests through both spoken and written language, influenced by the writer's conscious and unconscious choices.
2
Markers for Analysis
These choices, embedded in an individual's idiolect, serve as markers for authorship analysis, allowing researchers to identify distinctive patterns.
3
Language Variation Insights
Johnstone (1997) argues that the study of idiolect offers a nuanced approach to understanding language variation, revealing the interplay between personal expression and social factors.
4
Robust Attribution
This perspective is essential for accurate and robust authorship attribution, providing a foundation for identifying the true author of disputed texts.
Analysis Using Stylo Software
First Phase
Compared New Year's Messages (Corpus A) and Franco's "Testamento" (Corpus B) using the "stylo" software. Analyzed the 50, 100, and 350 most frequent words (MFW) and applied Eder's Delta method to cluster the texts.
Second Phase
Expanded the analysis to include all texts in Corpus A and B. Results remained consistent, reinforcing the hypothesis that Franco authored specific messages while others were likely written by different hands.
Third Phase
Included Corpus C in the analysis. Used "stylo" with parameters of 50, 100, and 350 MFW and Eder's Delta, observing clustering patterns that indicated multiple authors in the New Year's Messages.
First Phase Results
The results showed a close association of Franco's "Testamento" with the messages from 1937, 1938, 1946, and 1947, suggesting Franco's authorship of these specific messages. The similarity scores represent the stylistic closeness between Franco's known writing and the New Year messages.
Second Phase Findings
The second phase expanded the analysis to include all texts in Corpus A and B. The results remained consistent with the first phase, reinforcing the hypothesis that Franco authored the messages from 1937, 1938, 1946, and 1947, while other messages were likely written by different hands. The visualizations above show different representations of the stylometric analysis results.
Third Phase Analysis
Expanded Corpus
In the third phase, we included Corpus C in the analysis, which contained additional texts from potential collaborators and contemporaries of Franco.
This expansion allowed us to test the hypothesis that multiple authors contributed to the New Year's Messages over the years of Franco's rule.
Consistent Parameters
We maintained consistent analytical parameters, using the "stylo" software with settings of 50, 100, and 350 most frequent words (MFW) and applying Eder's Delta method.
This methodological consistency ensured that our results across different phases could be reliably compared and integrated.
Clustering Patterns
The analysis revealed distinct clustering patterns that strongly indicated the involvement of multiple authors in the New Year's Messages.
The consistent grouping of certain messages with Franco's "Testamento" confirmed his authorship of those specific years (1937, 1938, 1946, and 1947).
Key Conclusions from Stylo Analysis
Franco's Authorship Confirmed
There is a high probability that Franco authored the New Year's Messages of 1937, 1938, 1946, and 1947.
Multiple Contributors Identified
Other messages were likely written by different authors, suggesting multiple contributors over the years of Franco's rule.
Further Research Potential
Additional analysis could investigate the authorship of other texts traditionally attributed to Franco.
The findings of this study provide a clearer understanding of the authorship of Francisco Franco's New Year's Messages and demonstrate the effectiveness of stylometry in forensic linguistics. The consistent results across multiple analytical phases strengthen our confidence in these conclusions.
Likelihood Ratio (LR) Analysis
Statistical Framework
The Likelihood Ratio (LR) is a statistical framework employed to compare competing hypotheses in forensic text analysis.
This approach has gained widespread acceptance due to its quantitative nature and use of statistical models, which enhance transparency, reproducibility, and resilience against cognitive biases.
Key Elements
Quantitative Measurements: Utilizes numerical data to assess hypotheses
Statistical Models: Applies statistical paradigms to analyze evidence
Likelihood Ratio: Represents the strength of evidence for one hypothesis over another
LR Formula
LR = p(E|Hp) / p(E|Hd)
In this equation, LR is the ratio of the probability of evidence given the hypothesis of interest (Hp) to the probability of evidence given the competing hypothesis (Hd).
Application of LR to Franco's Texts
The LR framework was applied to compare the New Year's Message of 1973 attributed to Francisco Franco with texts authored by Ramón Serrano Suñer (RSS). The resulting LR value was 1.63459933607112, indicating that the hypothesis that Franco authored the 1973 message is 1.63459933607112 times more probable than the competing hypothesis that Suñer authored the message.
Interpretation of LR Results
Franco vs. Suñer
The LR value of 1.63459933607112 indicates that the probability that Franco authored the 1973 message is significantly higher than the probability that Ramón Serrano Suñer authored it.
This supports the hypothesis that Francisco Franco was indeed the author of the 1973 New Year's Message.
Comparison with Other Candidates
When applying the LR framework to texts by other potential authors, such as Luis Suárez, the analysis yielded lower LR values (e.g., 0.837556396347176).
These comparatively lower values further reinforce the likelihood that Franco authored the message rather than these alternative candidates.
Methodological Strength
The LR method provides robust support for determining authorship in forensic linguistic studies, confirming its efficacy in analyzing historical texts.
The quantitative nature of this approach enhances the reliability and objectivity of the authorship attribution process.
Linguistic Inquiry and Word Count (LWIC)
Comprehensive Analysis
Categorizes words into 100 different groups
Psychological Dimensions
Reveals cognitive and emotional states
Social Context
Identifies social dynamics in language
Grammatical Structure
Analyzes function words and linguistic patterns
The Linguistic Inquiry and Word Count (LWIC) tool enables a detailed examination of the language used in a given text, categorizing words into 100 different groups that encompass grammatical, psychological, social, and emotional dimensions. This computational tool is highly valued for its effectiveness in revealing the underlying psychological and social contexts of written communication.
Function Words in Authorship Analysis
Revealing Cognitive States
Function words (e.g., pronouns, articles, prepositions) are particularly revealing of cognitive and psychological states in text analysis.
Stability Over Time
These words remain stable over time and are less susceptible to change compared to content words, making them reliable markers for comparing texts from different periods.
Authorship Markers
In historical text analysis, such as the evaluation of Francisco Franco's New Year Messages, function words have proven effective in identifying the author's identity.
Pronoun Analysis in Franco's Texts
Pronouns are examined in relation to their reflection of social hierarchy and power dynamics. High-status individuals tend to use first-person pronouns less frequently, indicating a broader social focus, while lower-status individuals use them more frequently, displaying a self-focused perspective. The study identified a significant reduction in the use of the pronoun "yo" (I) in Franco's messages, with an overrepresentation of "nosotros" (we), which aligns with the linguistic profile of a political leader in control.
Analytical Thinking Assessment
78%
Articles & Prepositions
Higher frequency indicates analytical thinking
45%
Narrative Style
Lower frequency suggests intuitive approach
3.2
Cognitive Complexity
Average score in Franco's authenticated texts
Beyond addressing individual pronoun use, LWIC assesses the overall analytical thinking and cognitive mechanisms of the author by quantifying the use of articles and prepositions. Higher frequencies of these function words indicate analytical thinking, while lower frequencies suggest a more narrative, intuitive style. The analysis highlighted in the study illustrates the distinctive cognitive profiles among various texts attributed to Franco and emphasizes the importance of function words in establishing these differences.
Language Style Matching (LSM)
Synchronization Measurement
Reflects the alignment between texts in function word usage
Similarity Detection
Identifies stylistic connections between different documents
3
3
Authorship Connections
Suggests potential collaborative relationships between writers
Validation Tool
Provides additional evidence for authorship attribution
Language Style Matching (LSM) reflects the synchronization between texts in terms of function word usage. This metric is crucial for detecting similarities and differences in writing styles, suggesting authorship connections. The study found that Franco's texts exhibited high levels of LSM with those of certain candidates, particularly Luis Suárez Fernández and Ramón Serrano Suñer, reinforcing the hypothesis of their involvement in writing the messages.
LSM Results with Potential Co-authors
The study found that Franco's texts exhibited high levels of LSM with those of certain candidates, particularly Luis Suárez Fernández and Ramón Serrano Suñer, reinforcing the hypothesis of their involvement in writing the messages. These high similarity scores suggest potential collaboration or influence in the creation of the New Year's Messages attributed to Franco.
Key Findings: Authorship of Specific Messages
1937 Message
Strong evidence of Franco's authorship based on stylistic similarity with his Testament
1938 Message
Consistent stylistic markers indicating Franco's personal writing style
3
1946 Message
Stylometric analysis confirms high probability of Franco's authorship
4
1947 Message
Linguistic patterns closely match Franco's authenticated writings
The analysis revealed that Francisco Franco himself wrote the New Year Messages for the years 1937, 1938, 1946, and 1947. This conclusion is supported by the strong stylistic similarity between these messages and Franco's Testament, which is considered an authenticated text written by Franco himself.
Evidence of Multiple Authorship
1
Dispersal in Dendrograms
The results indicate that not all messages were written solely by Franco. The dispersal of the messages in the dendrograms suggests the involvement of other individuals in their creation.
2
Linguistic Style Matching
LSM results provide evidence of potential collaboration, particularly with figures like Ramón Serrano Suñer, who showed high stylistic similarity with certain messages.
3
Stylistic Inconsistencies
Analysis of function words and cognitive markers revealed significant variations across the corpus of messages, indicating different authorial voices.
4
Temporal Patterns
Messages from different time periods showed distinct stylistic clusters, suggesting changes in authorship or collaborative patterns over time.
Pronoun Usage Analysis
First-Person Singular ("yo")
The study found a notable reduction in the use of the first-person singular pronoun "yo" (I) in Franco's messages compared to typical political discourse of the era.
This reduced self-reference is consistent with the linguistic profile of high-status political leaders who often de-emphasize individual identity in favor of collective representation.
First-Person Plural ("nosotros")
Analysis revealed an increased use of the first-person plural "nosotros" (we) in Franco's authenticated messages.
This overrepresentation of collective pronouns reflects a leader's persona, emphasizing national unity and collective identity rather than personal authority.
Implications for Authorship
The distinctive pattern of pronoun usage serves as a reliable marker for identifying texts genuinely authored by Franco versus those written by collaborators.
Messages with significantly different pronoun distribution patterns likely indicate different authorship or substantial editorial intervention.
Triangulation of Research Methods
Cluster Analysis
Supported the authorship of Franco for the 1937, 1938, 1946, and 1947 messages through hierarchical clustering of stylistic features.
Likelihood Ratio (LR)
Reinforced the cluster analysis findings, providing statistical support for Franco's authorship of certain messages.
Language Style Matching (LSM)
Although it provided weaker evidence, it indicated possible collaboration, particularly with Serrano Suñer.
4
4
Consistency Validation
The convergence of results from multiple methods strengthened the overall reliability of the findings.
Limitations of the Study
Tool Dependence
The results heavily depend on the tools (stylo, LIWC) used, which have their own limitations and assumptions that may affect the analysis.
Sample Size Variability
The variability in the size of text samples could have influenced the results, potentially skewing the stylometric analysis.
Subjective Interpretation
Interpreting dendrograms and LSM analysis requires subjective judgment, which may not always yield irrefutable conclusions.
Hypothesis Dependence
The inclusion of Franco's Testament as an undoubtedly authored text is based on a hypothesis that could be challenged, affecting the entire analysis.
Implications for Historical Research
Reinterpreting Franco's Legacy
Understanding the true authorship of Franco's messages provides new insights into his leadership style and the power dynamics within his regime.
Collaborative Governance
Evidence of multiple authorship suggests a more collaborative approach to governance than previously understood.
Political Discourse Analysis
The study sets a precedent for analyzing political discourse through computational linguistics.
Academic Methodology
Demonstrates the value of digital humanities and computational stylistics in historical research.
Future Research Directions
Expanded Corpus Analysis
Include additional texts attributed to Franco and his contemporaries to strengthen the comparative base and refine authorship attribution.
Advanced Machine Learning
Apply newer AI and machine learning techniques to detect more subtle patterns in authorship that current methods might miss.
Co-authorship Networks
Investigate the relationships between Franco and his potential collaborators to understand the dynamics of speech writing in his regime.
Cross-linguistic Comparison
Compare Franco's linguistic patterns with those of other contemporary dictators to identify common features of authoritarian discourse.
Potential Co-authors Identified
Ramón Serrano Suñer
Franco's brother-in-law and close advisor showed high stylistic similarity with certain messages, particularly in later years. The LSM analysis revealed significant linguistic overlap between Suñer's known writings and several of Franco's New Year addresses.
Luis Suárez Fernández
The historian and Franco biographer exhibited linguistic patterns that closely matched some of the messages. His academic background may have influenced the more analytical style found in certain addresses.
Other Regime Officials
The analysis suggested that various other officials within Franco's government likely contributed to the messages, though their specific identities could not be conclusively determined from the available evidence.
Methodological Contributions
Integration of Multiple Tools
Combined stylo, LIWC, and statistical analysis
Triangulation Approach
Used multiple methods to verify findings
Historical Context Integration
Connected linguistic analysis with historical events
This study makes significant methodological contributions to the field of authorship attribution by demonstrating the effectiveness of combining multiple analytical approaches. The integration of stylometric analysis with linguistic inquiry and statistical methods provides a more robust framework for identifying authorship in historical texts. The triangulation approach, which verifies findings through multiple independent methods, enhances the reliability of the conclusions and establishes a model for future research in digital humanities and computational stylistics.
Evolution of Franco's Political Discourse
1
1937-1938
Early messages show more direct, military-influenced language with stronger personal voice, confirmed to be written by Franco himself.
2
1939-1945
Wartime messages display shifting authorship patterns, with evidence of collaborative writing during World War II period.
3
1946-1947
Post-war messages return to Franco's personal style, focusing on Spain's position in the changing international landscape.
4
1948-1960s
Economic development era shows increased influence from technocrats and advisors in message composition.
5
1970s
Late-period messages reveal significant collaborative authorship, with Franco's direct input diminishing.
Significance for Digital Humanities
Computational Approach to History
This study demonstrates how computational methods can reveal new insights into historical texts that traditional analysis might miss.
Open-Source Tools
The use of accessible tools like stylo in R programming language shows how digital humanities can democratize advanced research techniques.
Interdisciplinary Collaboration
The research bridges linguistics, history, statistics, and computer science, highlighting the value of interdisciplinary approaches.
Data-Driven Historical Analysis
The study sets a precedent for using data-driven methods to answer complex historical questions about authorship and political communication.
Summary of Key Findings
The research conclusively identified Franco as the author of the 1937, 1938, 1946, and 1947 New Year Messages, with strong evidence suggesting his significant contribution to several others. However, the majority of messages appear to have been collaborative efforts or written entirely by other individuals. The stylometric analysis revealed distinct patterns of authorship across the corpus, with function words and pronoun usage serving as particularly reliable markers for attribution.
Conclusions and Future Perspectives
Multi-faceted Authorship
The authorship of Franco's New Year Messages is multi-faceted. While Franco undoubtedly wrote some of them (1937, 1938, 1946, and 1947), others were likely written with the collaboration of individuals like Ramón Serrano Suñer and other regime officials.
This finding challenges the traditional view of Franco as the sole voice behind these important political communications and reveals a more complex picture of how his regime crafted its public image.
Methodological Value
This study highlights the value of digital humanities, computational stylistics, and stylometry in literary and historical research. The combination of multiple analytical approaches provided robust evidence for authorship attribution.
The triangulation of results from cluster analysis, likelihood ratio calculations, and language style matching strengthened the reliability of the findings and demonstrated the effectiveness of interdisciplinary methods.
Future Directions
Further research is needed to precisely identify the co-authors and resolve remaining uncertainties. Expanding the corpus to include more texts from potential collaborators could provide additional insights.
This study sets a precedent for future investigations into political discourse and authorship attribution, particularly in authoritarian regimes where the true voices behind official communications may be obscured.