Semi-Assisted Platform for Intelligent Historical Transcription (SIHT):

Revolutionising the Digitisation of Historical Documents

The Semi-Assisted Platform for Intelligent Historical Transcription (SIHT) represents a pioneering breakthrough in the transcription of historical documents, fusing state-of-the-art artificial intelligence (AI) with expert human interaction. This hybrid approach not only optimises efficiency and accuracy in the transcription of ancient documents, but also redefines the large-scale digitisation process. 

  1. Key features of SIHT
  2. Innovation and technology with SIHT: The future of data management and computer vision
  3. Impact and benefits of SIHT
  4. Production and delivery of the Historical Transcription Service
  5. Quality control at every stage
  6. Who can use it?
  7. Our team
  8. Pricing policy and social commitment
  9. Circular economy is part of our values
  10. Contact us

Key features of SIHT

  • Advanced Artificial Intelligence: We use state-of-the-art algorithms in natural language processing and computer vision, capable of recognising and transcribing historical manuscript texts accurately.
  • Intuitive and user-friendly interface: Our platform provides a seamless user experience, allowing automatic transcriptions to be uploaded, viewed and corrected with ease and efficiency.
  • Rich collaborative editing: We provide a collaborative space where historians and palaeography experts can contribute their knowledge, thus improving the quality of transcriptions.
  • Detailed revision record: We maintain a comprehensive record of all modifications, ensuring transparency and ease of monitoring and document auditing.
  • Continuous AI learning: Our AI is constantly being perfected, adapting and improving its accuracy with every user interaction and feedback.
  • Project management tools: We offer solutions for the efficient management of large-scale transcription projects, ensuring optimal organisation and distribution of documents.
  • Export and secure storage: We provide powerful options for the export and secure storage of transcripts and original documents.
  • Maximum security and data protection: We are aware of the confidentiality of historical documents and guarantee the strictest security and data protection measures.
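
As an illustration of the detailed revision record described above, the following minimal Python sketch keeps an append-only history of corrections made to one transcribed line. The class and field names (`Revision`, `TranscriptionLine`, `editor`, `note`) and the sample text are hypothetical. They are not taken from the SIHT codebase, whose internals are not described here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    """One immutable entry in the audit trail (hypothetical structure)."""
    editor: str
    old_text: str
    new_text: str
    timestamp: datetime
    note: str = ""


@dataclass
class TranscriptionLine:
    """A transcribed line plus the full history of changes made to it."""
    text: str
    history: list[Revision] = field(default_factory=list)

    def revise(self, editor: str, new_text: str, note: str = "") -> None:
        """Apply a correction and record who changed what, and when."""
        if new_text == self.text:
            return  # nothing changed, so nothing to record
        self.history.append(
            Revision(editor, self.text, new_text, datetime.now(timezone.utc), note)
        )
        self.text = new_text

    def original_text(self) -> str:
        """Return the text as it was before any revision."""
        return self.history[0].old_text if self.history else self.text


# Illustrative usage: an automatic transcription corrected by one expert.
line = TranscriptionLine("Universitat de Brcelona")
line.revise("expert_1", "Universitat de Barcelona", note="restored missing letter")
for rev in line.history:
    print(f"{rev.editor}: {rev.old_text!r} -> {rev.new_text!r} ({rev.note})")
```

Because revisions are only ever appended, the original automatic output can always be recovered, which is what makes auditing possible.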

Innovation and technology with SIHT: The future of data management and computer vision

We work at the forefront of computer vision technology, a discipline that processes and interprets digital images using computer programmes inspired by human visual models. Our approach integrates various disciplines such as artificial intelligence, mathematics, optics and signal processing to offer advanced solutions in document digitisation and analysis.

Development of Artificial Intelligence tools
  • Specialization in document analysis: We focus on improving intelligent systems for reading scanned documents, especially in the field of historical manuscripts with unique challenges such as physical degradation, complex handwriting styles and variability in calligraphy.
  • Innovation in information extraction: We combine deep learning techniques with semi-assisted systems to extract information from handwritten tax documents, overcoming the limitations of traditional OCR systems.
Data management with MySQL
    • Relational database management system: We use MySQL, an open-source database management system, recognised for its efficiency, scalability and versatility, adapting to a wide range of needs and computing environments.
    • Key advantages of MySQL:
    1. Multiplatform: Availability on multiple operating systems.
    2. High performance and efficiency: Effective optimisation for large data sets.
    3. Scalability: Adaptability to growing database needs.
    4. Compatibility with programming languages: Integration with PHP, Python, Java, C++, among others.
    5. Advanced features: Support for stored procedures, triggers, replication and data encryption.
    6. Active community: A group of developers and users who offer continuous support and collaboration.

    Impact and benefits of SIHT

    • High efficiency: We reduce transcription time significantly. What might traditionally take months is done in days thanks to the integration of AI.
    • Accuracy and quality: We combine AI efficiency with human accuracy, ensuring high quality transcripts.
    • Institutional support: We receive financial support from the Generalitat de Catalunya for technological innovation projects in co-operativism led by women (2023-24), and we collaborate with the prestigious Computer Vision Centre of the Universitat Autònoma de Barcelona (CVC).

    Academic and scientific impact

    • Doctoral thesis supervision: We have supervised 18 doctoral theses and final degree projects at the UAB (Universitat Autònoma de Barcelona) and UPF (Universitat Pompeu Fabra), covering diverse topics such as climate change, agricultural co-operativism, social economy and public policies.
    • Participation in R&D&I projects: We are involved in 11 R&D&I projects funded in competitive calls, demonstrating our capacity in applied research and technological development.
    • Renowned scientific publications: Our team has published more than 22 books, publications and scientific and technical papers, some of them in high-impact-factor journals.
    • Presence at national and international conferences: We have given 24 presentations at conferences between 2014 and 2023, highlighting our active participation in the international scientific community.

    Historical Transcription Service Production and Supply

    The production and supply of the service relies on a combination of advanced technology and human expertise to ensure the accuracy and quality of the historical transcriptions.

    Collaboration between AI algorithms and human experts is critical to the success of the platform. The production and supply of the Semi-Assisted Platform for Intelligent Historical Transcription (SIHT) service involves a number of key components and processes.

    The service begins with personalised attention to interested organisations, whether they approach us on demand or through workshops, seminars, presentations, the web or social media. In every case, once a firm agreement is reached on the basis of a proposal, agreement or public tender, the service consists of the following phases:

    Our services:

    1. Digitisation of historical documents: We start with high-resolution digitisation, using specialised equipment to capture sharp images of historical documents, ensuring high fidelity and quality. Every step, from screening to secure storage, is done with meticulous attention to detail.
    2. Uploading documents to the platform: Digitised documents are uploaded to SIHT through an intuitive interface. We support various file formats and provide options for efficient organisation and data protection.
    3. Image processing: We apply advanced processing techniques to improve clarity and correct flaws in the images, preparing them for transcription.
    4. Automated transcription: We use AI and NLP algorithms to perform an initial transcription, transforming the handwritten text into digital data (specific details are kept confidential for research reasons).
    5. Human validation and correction: Experts in palaeography review and correct the initial transcription, ensuring accuracy and preservation of the original format.
    6. User-friendly interface: Our platform makes it easy to upload documents, review transcripts and collaborate on projects.
    7. Secure storage and access: We ensure secure storage of transcribed documents and originals for future access and retrieval.
    8. Scalability and ongoing support: We design the platform to handle large volumes of documents and provide ongoing support to our users.
    9. Data protection and security: We implement rigorous security measures to protect the privacy and ownership of documents.
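
The hand-off between automated transcription (phase 4) and human validation (phase 5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SIHT implementation: the recognition model is confidential, so `stub_recogniser` is a placeholder that invents a confidence score, and ordering the review queue by lowest confidence first is a hypothetical design choice. All names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LineResult:
    line_id: int
    ai_text: str                       # output of the automated transcription
    confidence: float                  # hypothetical score in [0, 1]
    final_text: Optional[str] = None   # filled in by human validation


def stub_recogniser(image_lines: list[str]) -> list[LineResult]:
    """Placeholder for phase 4. It echoes its input and derives a made-up
    confidence from the text length, purely so the sketch can run."""
    return [
        LineResult(i, text, confidence=min(1.0, len(text) / 20))
        for i, text in enumerate(image_lines)
    ]


def review_queue(results: list[LineResult]) -> list[LineResult]:
    """Order lines so that experts see the least confident ones first."""
    return sorted(results, key=lambda r: r.confidence)


def validate(results: list[LineResult],
             correct: Callable[[LineResult], str]) -> list[LineResult]:
    """Phase 5: every line passes through an expert-supplied correction."""
    for result in review_queue(results):
        result.final_text = correct(result)
    return results


# Illustrative usage: the lambda stands in for a palaeographer's correction.
scanned = ["a considerably longer scanned line", "a short line"]
results = stub_recogniser(scanned)
validated = validate(results, correct=lambda r: r.ai_text.strip().capitalize())
for r in validated:
    print(r.line_id, round(r.confidence, 2), r.final_text)
```

Every line still receives human validation, as in the phases above. The confidence score only decides the order in which experts see the lines.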

    Quality control at each stage

    We keep strict quality control at every stage of the service to ensure accurate and high-quality transcriptions. From visual inspection prior to scanning to final verification of the transcript, each step is subject to rigorous review and continuous improvement based on user feedback.
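
One common way to quantify transcription accuracy in handwriting recognition is the character error rate (CER): the edit distance between the automatic output and the expert-corrected reference, divided by the length of the reference. The source does not state which metric SIHT uses, so the sketch below is an assumption offered for illustration only, and the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b`."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def character_error_rate(hypothesis: str, reference: str) -> float:
    """Edit distance normalised by the length of the corrected reference."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(hypothesis, reference) / len(reference)


# Illustrative usage: one missing letter in a nine-letter word.
cer = character_error_rate("Brcelona", "Barcelona")
print(f"CER: {cer:.3f}")
```

Tracking this value before and after human correction gives a simple indicator of how much work the validation stage is doing on a given collection.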

    Who can use it?

    Our service is designed for a wide range of users, including researchers, cultural institutions, students, teachers, genealogists and the general public. We identify and address the specific needs of each segment to ensure that our platform is a valuable and accessible tool for all of them.

    Our team

    We have a multidisciplinary team of highly qualified professionals in science and technology. Our co-operative, which is also oriented towards carrying out scientific and technological projects with a high cultural and environmental impact, is made up of people with extensive experience in the national and international technological field.

    Team achievements and awards
    • Innovative awards: Our founding team has been recognised with the first prize of the Diputació de Barcelona for Best Start-up in 2005, excelling in e-marketplace projects.
    • Pioneering projects: We have participated in innovative projects such as Undertile (2013) in digital memory, awarded with an ENISA, and the SSoT project in smart sensors for adaptation to climate change.
    Collaboration with the Computer Vision Centre (CVC) - Universitat Autònoma de Barcelona.
    • Leadership in research: We collaborate with the prestigious CVC, a world reference in advanced computer vision research, with more than 130 professionals.
    • Academic and scientific excellence: Founded in 1995 by Generalitat de Catalunya and the UAB, the CVC is a leader in the transfer of technological knowledge to industry and society and in the training of high-level professionals.
    Research group on reading systems and document analysis.
    • European benchmark in handwriting recognition: Specialised in historical texts, this group carries out fundamental research at the frontier between vision, language and reading systems.
    • Innovation and global leadership: With more than 25 years of experience, the group is one of the most influential worldwide, with more than 180 publications and 10,150 citations in the last 5 years.
    • Projects with international impact: We lead and participate in competitive research projects at national and European level and collaborate with international laboratories and private companies.
    • Contributions and leadership in international organizations: Some members of our team play leading roles in international technical committees and have actively participated in major conferences in document analysis and computer vision.
    Key contributions and collaborations
    • Preservation of historical heritage: We focus on improving intelligent systems for reading textual and graphical documents, contributing significantly to the preservation and dissemination of historical and cultural heritage.
    • Projects of outstanding importance: We have been collaborating for more than 20 years with the Centre for Demographic Studies in major projects such as the ERC Advanced Grant Five Centuries of Marriages and the Recercaixa projects EINES I XARXES (TOOLS and NETWORKS).
    • Innovation in historical social networks: We have developed the Xarxa social històrica (Historical Social Network), a project that makes it possible to visualise a historical social network graphically through a web platform (http://dag.cvc.uab.es/xarxes/), using data extracted on a large scale from demographic manuscripts.
    • Technology transfer: We have experience in technology transfer for companies and public institutions, working on systems for reading manuscript documents, historical photographs, music scores, among others.

    Pricing policy and social commitment

    At L’Àgora de la Terra we are committed to a non-profit business model, which means that our main motivation is to achieve a significant social impact while maintaining economic sustainability. Our pricing policy reflects this commitment: we focus on remunerating our members’ work fairly and, in the event of surpluses, these are reinvested in the co-operative to fulfil our social purpose. We understand that, as we receive public support, it is our responsibility to return to society through a fair price, based only on the cost of the work done.

    Advantages offered by our Semi-Assisted Platform for Intelligent Historical Transcription (SIHT)
     

    • Efficiency and accuracy: SIHT is a revolutionary innovation that combines artificial intelligence with human expertise to transcribe historical documents efficiently and accurately, facilitating preservation and access to the past.
    • Speed and accuracy: Our AI technology automates the initial transcription, allowing users to quickly transcribe large volumes of historical writing while maintaining high accuracy.
    • Universal access: Our platform allows easy access to historical documents, unlocking valuable information previously inaccessible, contributing significantly to heritage research and preservation.
    • Collaboration and continuous improvement: We promote active collaboration between users, enabling constant revision and improvement of transcriptions.
    • Customisation and scalability: SIHT is adaptable to the needs of a wide range of users, from historians to cultural institutions, and is scalable for projects of any size.
    • Ongoing support: We offer exceptional support to ensure the quality of your operation, with a team that is always available to assist at every step of the project.
    • Security and data protection: We prioritise data security, ensuring comprehensive protection of historical documents and confidential data.
    • Positive social impact: Our work in the transcription and preservation of historical documents contributes to the preservation of cultural heritage and the advancement of historical research.
    • Versatility of application: Our platform adapts to various fields, from academic research to genealogy and publishing.

    Circular economy is part of our values

    We incorporate circular economy principles to maximise efficiency and sustainability. From digitisation and storage to reuse and recycling of data, we strive to minimise environmental impact and promote collaboration and knowledge sharing.

    Contact

    For further information on how SIHT can transform your document management and help preserve cultural heritage, please contact us at hola@agoradelaterra.cat

     
