Beyond PDF: Extracting Continuous Text with Machine Learning

Hosted by Code for Berlin

We are looking forward to welcoming you all to our next remote Open Knowledge Lab Berlin!

As before, we will use the BBB instance of our friends in Ulm: https://bbb.ulm.dev/b/ok-lab-berlin.

This time, Johannes Filter gives an overview of PDF text extraction and presents his new tool, pd3f, which uses machine learning to reconstruct the original continuous text.

If you have any questions, do not hesitate to contact us beforehand. Just send an e-mail to berlin@codefor.de and we will get back to you shortly!

Time:
Sept. 14, 2020, 7 p.m. - Sept. 14, 2020, 9 p.m.
Place:
No place selected yet.

Attendees (1)

k-nut Yes
Host
