Welcome to the Image Understanding and Pattern Recognition (IUPR) research group (aka AG Breuel) at the Computer Science Department of the University of Kaiserslautern and the German Research Center for Artificial Intelligence (DFKI).
Our research group conducts basic and applied research in pattern recognition, machine learning, image understanding, and artificial intelligence, with practical applications to digital libraries, network security, bioinformatics, historical document analysis, and scientific data analysis. To learn more about us, have a look at our Research Themes, Projects, and Publications.
You can try out some of our software in online demos, download open source packages, and browse our documentation.
Document analysis deals with the visual and geometric analysis of document images. The goal is to recover textual content, geometric structure, and logical structure. With the resurgent interest in digital libraries and large-scale scanning operations by organizations like Google, Microsoft, and the Internet Archive, document analysis has once again become a very important real-world problem. We are addressing document analysis at all levels: camera-based document and book capture, OCR and handwriting recognition, document retrieval, and document enhancement. In addition to its practical applications, document analysis is also an important test case for more general computer vision and machine learning algorithms, because large amounts of correctly ground-truthed data are available.
OCR and Layout Analysis: OCRopus project, OCRopus demo, page layout analysis demo
Camera-Based Document Capture: OSCAR camera-based document capture demo, document dewarping demo
Content Analysis and Information Extraction: appearance-based document retrieval demo, bibliographic reference recognition demo
Computing for the Humanities: historical document analysis/comparison demo
Additional demos are listed on the IPeT Demo Page.
For a general overview, please see The OCRopus Open Source OCR System.
Paper-based documents are widely used for identification, authentication, and legal purposes. Forgery of such documents is a major component of insurance fraud, immigration fraud, tax evasion, and other white-collar crime. Although optical security measures like holograms and special paper are partial solutions in areas such as currencies and passports, they are expensive and are not applicable when the creation of the document is not under the control of the organization that needs to verify it. We are developing techniques that allow the authenticity of ordinary paper documents to be verified using optical techniques.
Large numbers of images and videos are captured in numerous contexts: consumer digital imaging, surveillance, industrial inspection, satellite imagery, astronomy, and many other areas. We are applying image processing, pattern recognition, and machine learning techniques to problems such as the detection of anomalous behaviors, defect detection in industrial inspection, quantitative analysis of large amounts of astronomical image data, and media asset management.
Please find more information on our new website: http://sites.google.com/a/iupr.com/iupr-network-analysis/
Network security currently relies largely on systems techniques like secure protocols and rule-based or pattern-based methods. We are applying statistical, decision-theoretic, pattern recognition, and machine learning techniques to the automated and adaptive identification and remediation of distributed denial of service (DDoS) attacks and intrusion attempts.
DDoS attacks are among the most serious threats on the Internet today. Servers are flooded with a tremendous number of nonsense requests from thousands of clients in order to overload or even crash them. Usually, a single attacker controls a powerful network (botnet) of Trojan-infected PCs and lets them attack a web service simultaneously without the knowledge of the PC owners. DDoS attacks seriously harm e-businesses such as web shops, online auctions, and online banking, or simply damage a company's reputation.
The fact that the requests originate from computers all over the world and might even look like legitimate request messages makes it very hard to filter or firewall them in the classical way.
Nevertheless, the requests are machine generated and not initiated by a human. Our new approach to detecting and preventing DDoS attacks aims to detect the anomalous patterns that result from these machine-generated packets. To this end, we use pattern recognition methods to identify and filter the non-legitimate packets based on multiple parameters, such as routing information, origin networks, coherence of document structures, and many others.
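As a rough illustration of the general idea of filtering traffic by statistical anomaly, and not a description of our actual system, the following Python sketch fits a per-feature baseline on traffic assumed to be legitimate and flags clients whose feature vectors lie far from it. The two features (requests per second and the coefficient of variation of inter-arrival times), the sample values, the function names, and the threshold are all hypothetical choices made for this example.

```python
import math
from statistics import mean, pstdev


def fit_baseline(samples):
    """Estimate a mean and standard deviation for each feature column.

    `samples` is a list of equal-length feature tuples taken from traffic
    that is assumed to be legitimate (an assumption of this sketch).
    """
    columns = list(zip(*samples))
    # Fall back to 1.0 if a feature is constant, to avoid dividing by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def anomaly_score(baseline, features):
    """Euclidean norm of the per-feature z-scores (diagonal Gaussian model)."""
    return math.sqrt(
        sum(((value - mu) / sigma) ** 2
            for value, (mu, sigma) in zip(features, baseline))
    )


def is_suspicious(baseline, features, threshold=3.0):
    """Flag a client whose score exceeds a hypothetical threshold."""
    return anomaly_score(baseline, features) > threshold


# Hypothetical features: (requests per second, inter-arrival time variation).
legitimate = [(2.0, 0.9), (3.0, 1.1), (2.5, 1.0), (1.5, 0.8), (3.5, 1.2)]
baseline = fit_baseline(legitimate)

# A machine-generated stream: very high rate, almost perfectly regular timing.
bot_like = (200.0, 0.01)
print(is_suspicious(baseline, bot_like))
print(is_suspicious(baseline, (2.2, 0.95)))
```

A real detector would combine many more parameters and would have to cope with attackers who imitate legitimate traffic; the sketch only shows how a learned baseline can separate regular, high-volume machine traffic from human-driven requests.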
Most classifiers in common use for tasks like speech recognition, handwriting recognition, and OCR are trained once on a large dataset and then deployed; some classifiers may additionally attempt to adapt to specific inputs. In our work, we are concerned with long-term learning and adaptation to input data. Several components play into this: learning from unlabeled or weakly labeled training data, learning from small numbers of training examples, active learning, style and user modeling, and statistically valid long-term modifications of models.
Our group has solved a number of difficult geometric problems using combinations of interval arithmetic and branch-and-bound optimization. The resulting algorithms are often far simpler and more general than traditional algorithms in computational geometry, yet make precise numerical guarantees.
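To show how interval arithmetic and branch-and-bound fit together, here is a minimal Python sketch that globally minimizes a one-dimensional function. It is a toy example rather than one of our geometric matching algorithms: the `Interval` class, the objective `f(x) = x^2 - 2x`, and all function names are assumptions introduced for illustration, and the interval operations use ordinary floating point without outward rounding, so the bounds are guaranteed only up to rounding error.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

    def mid(self):
        return 0.5 * (self.lo + self.hi)


def f_point(x):
    """Toy objective evaluated at a single point."""
    return x * x - 2.0 * x


def f_interval(box):
    """Interval extension: encloses every value of f_point on `box`."""
    return box * box - Interval(2.0, 2.0) * box


def branch_and_bound(f_point, f_interval, domain, tol=1e-6):
    """Return (x, lower, upper) with lower <= min f <= upper = f(x)."""
    best_x = domain.mid()
    best_upper = f_point(best_x)
    # Priority queue of boxes ordered by their lower bound on f.
    heap = [(f_interval(domain).lo, domain.lo, domain.hi)]
    while heap:
        lower, lo, hi = heapq.heappop(heap)
        # The smallest remaining lower bound is within tol of the best
        # value found so far, so the minimum is bracketed tightly enough.
        if best_upper - lower <= tol:
            return best_x, lower, best_upper
        mid = 0.5 * (lo + hi)
        for half in (Interval(lo, mid), Interval(mid, hi)):
            x = half.mid()
            fx = f_point(x)
            if fx < best_upper:
                best_x, best_upper = x, fx
            half_lower = f_interval(half).lo
            # Prune boxes that cannot contain anything better.
            if half_lower <= best_upper:
                heapq.heappush(heap, (half_lower, half.lo, half.hi))
    return best_x, best_upper, best_upper


x, lower, upper = branch_and_bound(f_point, f_interval, Interval(-4.0, 4.0))
print(x, lower, upper)
```

The search never relies on derivatives or on problem-specific geometry: it needs only a way to bound the objective over a box, which is why the same scheme carries over to higher-dimensional parameter spaces, and the returned pair of bounds brackets the true minimum to within `tol`.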
Evaluation of pattern recognition and machine learning algorithms is commonly carried out on “real data”. While this seems appealing and challenging at first glance, it yields fairly little information about how well a method generalizes to other problems or how robust it is. We are developing new statistical methods, evaluation methods, and validation methods for experimentally characterizing machine learning algorithms, and for diagnosing problems in large, complex pattern recognition systems.