TEXUS: A unified framework for extracting and understanding tables in PDF documents