Technology

A technology-centric approach to mass customization
The technology vision of BiblioLabs is to allow institutions to collect and digitally capture any content in any location, to process that content efficiently and flexibly into any output format (BiblioLabs is format agnostic), and to let partners customize the final product to meet their needs. To achieve this vision, the Company developed a hardware/software architecture that combines decentralized media aggregation with a centralized end-to-end software system, Global Book Manager (GBM), to manage the creation, marketing, enrichment and distribution of content.

BiblioLabs Global Book Manager Application Suite
The Company’s Global Book Manager (GBM) application suite is an end-to-end enterprise software system that is centralized inside a secure, scalable data center, where banks of computers can work on the content creation process 24 hours a day. The vision for GBM is simple: (a) Take any type of media in any digital format; (b) Create finished products in a variety of formats (print or electronic); (c) Allow the books to be customized and/or branded by retail partners; (d) Allow communities to enrich the information about the media (descriptions, biographies, reviews); and (e) Make the final product available in as many distribution channels as possible.

The GBM application suite database and storage engines are based primarily on an open-source, service-oriented architecture. GBM was architected to be scalable (processing nodes can be added dynamically in a decentralized manner), modular (software objects can be added and reused to provide new application functionality), and extensible. Leveraging the distributed processing paradigm to maximize application performance, the system was designed to be manageable from any system with an Internet connection and to integrate directly with systems from BiblioLife content partners, outsourced quality assurance vendors, strategic retail partners and print-on-demand fulfillment partners. Because the system is “live” 24 hours a day, 7 days a week, BiblioLabs has built the GBM platform on a scalable, distributed and redundant hardware architecture to ensure system reliability and availability.

Perhaps the most technically challenging aspect of the GBM application suite was building the Core Processing Engine, a sophisticated application that automatically detects different source file attributes and applies appropriate processing parameters. For example, the application is able to automatically detect certain image types (graphics versus text) within the same source file and process each accordingly. While graphic images are automatically processed using tonal curves, gamma correction and descreening, text pages are automatically converted to bi-tonal (to ensure font sharpness) and a median filter is applied to enhance the text. All pages are processed using certain common functions such as deskewing, dewarping, alignment and despeckling. The result is a massively distributed application that automatically creates extremely high quality output from digital images of varying attributes.
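The routing described above can be illustrated with a minimal Python sketch. It is not the Core Processing Engine itself: the text-versus-graphic heuristic, the thresholds, the gamma value and every function name here are assumptions chosen for illustration, and pages are modeled as plain lists of 8-bit grayscale values.

```python
from statistics import median_low
from typing import List, Tuple

Page = List[List[int]]  # 8-bit grayscale pixels, row-major (illustrative model)

def looks_like_text(page: Page, extreme_fraction: float = 0.9) -> bool:
    """Assumed heuristic: text pages are mostly near-black or near-white."""
    pixels = [p for row in page for p in row]
    extreme = sum(1 for p in pixels if p < 64 or p > 191)
    return extreme / len(pixels) >= extreme_fraction

def to_bitonal(page: Page, threshold: int = 128) -> Page:
    """Convert every pixel to pure black (0) or pure white (255)."""
    return [[255 if p >= threshold else 0 for p in row] for row in page]

def median_filter(page: Page) -> Page:
    """3x3 median filter; border pixels use the neighbours that exist."""
    h, w = len(page), len(page[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [page[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            # median_low always returns an actual pixel value, so a
            # bi-tonal page stays bi-tonal even for even-sized windows.
            row.append(median_low(window))
        out.append(row)
    return out

def gamma_correct(page: Page, gamma: float = 2.2) -> Page:
    """Apply a simple power-law gamma curve to each pixel."""
    return [[round(255 * (p / 255) ** (1 / gamma)) for p in row] for row in page]

def process_page(page: Page) -> Tuple[str, Page]:
    """Pick a processing path per page, as the passage describes."""
    if looks_like_text(page):
        return "text", median_filter(to_bitonal(page))
    return "graphic", gamma_correct(page)
```

A white page with a single dark speck is routed down the text path and comes out clean, while a page of mid-gray values is treated as a graphic and only has its tones adjusted.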

Similar to traditional enterprise applications such as Enterprise Resource Planning or Customer Interaction Management, the GBM software application was designed to address the end-to-end management of the creation, customization and distribution of printed media. The Company plans to build enhancements to GBM in areas such as book customization (custom forewords, inscriptions, covers, advertising, etc.), on-line analytic tools for general reporting and decision support, and API (application programming interface) enhancements to better support strategic retail partners.

The GBM application and database architecture was built on the premise that it could blindly process or reject native digital media files, without prior knowledge of the size, file type or quality of the incoming content. Where a completely automated nirvana was not achievable, the application process provided an automated structure that kept the required human intervention to a minimum (GBM QA Manager). The architecture dictated that all software components be self-contained, re-usable objects, with each object running as a stand-alone application “service” that could be turned on or off on any given computer system (or even run as multiple application instances on the same system), thereby maintaining maximum throughput, efficiency and scalability.
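As a rough illustration of these two ideas (accepting or rejecting files with no prior knowledge, and running each component as a switchable service), consider the Python sketch below. The class names, the job dictionary and the choice of supported formats are hypothetical and are not taken from GBM; only the file signatures themselves are standard.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Standard file signatures ("magic bytes") for a few common formats.
MAGIC = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
]

def sniff_format(header: bytes) -> Optional[str]:
    """Identify a file from its leading bytes, ignoring name and extension."""
    for magic, name in MAGIC:
        if header.startswith(magic):
            return name
    return None

@dataclass
class Service:
    """A self-contained processing step that can be switched on or off."""
    name: str
    handler: Callable[[dict], dict]
    enabled: bool = True

@dataclass
class Node:
    """One machine running whichever services are enabled on it."""
    services: List[Service] = field(default_factory=list)

    def run(self, job: dict) -> dict:
        for service in self.services:
            if service.enabled and job.get("status") != "rejected":
                job = service.handler(job)
        return job

def intake(job: dict) -> dict:
    """Accept or reject a file purely from its content."""
    fmt = sniff_format(job["header"])
    if fmt is None:
        return {**job, "status": "rejected"}
    return {**job, "status": "accepted", "format": fmt}

def measure(job: dict) -> dict:
    """Hypothetical later stage: record how many bytes were inspected."""
    return {**job, "size": len(job["header"])}
```

A node built as `Node([Service("intake", intake), Service("measure", measure)])` accepts a job whose header starts with `%PDF-` and rejects one that starts with `GIF89a`. Setting `enabled = False` on a service removes that step from the node without touching the others.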

The result of the GBM application development initiative has been a comprehensive application workflow that not only provides a complete end-to-end solution, but does so in a manner that allows for minimal file transfer between computers, with a database structure that verifies and validates each step in the process. The software architecture further eliminates the need to permanently store multiple versions of each book block, cover, web page thumbnail and corresponding ONIX metadata file. Instead, it generates and transmits these files only as needed, and only in those formats that the corresponding sales channel can accommodate.
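The generate-on-demand approach can be sketched in a few lines of Python. Everything named here is a placeholder: the channel names, the artifact kinds each channel accepts and the renderer functions are assumptions for illustration, and the renderers return file names instead of producing real files.

```python
from typing import Callable, Dict, List

# Hypothetical renderers: each builds one artifact for a book on request.
RENDERERS: Dict[str, Callable[[str], str]] = {
    "book_block": lambda book_id: f"{book_id}-block.pdf",
    "cover": lambda book_id: f"{book_id}-cover.pdf",
    "thumbnail": lambda book_id: f"{book_id}-thumb.jpg",
    "onix": lambda book_id: f"{book_id}-onix.xml",
}

# Hypothetical channels and the artifact kinds each one can accommodate.
CHANNEL_ACCEPTS: Dict[str, List[str]] = {
    "print_partner": ["book_block", "cover", "onix"],
    "web_retailer": ["thumbnail", "onix"],
}

def build_package(book_id: str, channel: str) -> Dict[str, str]:
    """Generate only the artifacts that the given channel accepts."""
    return {
        kind: RENDERERS[kind](book_id)
        for kind in CHANNEL_ACCEPTS[channel]
        if kind in RENDERERS
    }
```

Calling `build_package("b1", "web_retailer")` produces only a thumbnail and an ONIX file. Nothing is stored ahead of time, and the print artifacts are never created for a channel that cannot use them.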

EXAMPLE: BiblioLabs GBM Core Processing Engine - Before and After

Careers

Human capital

It’s all about the people. Although BiblioLabs is a technology company, it is first and foremost a knowledge business. The Company prides itself on hiring only the most inspired, highly motivated and knowledgeable people, who enjoy working in a fast-paced environment. If you think you have what it takes, then we would like to hear from you! BiblioLabs is currently seeking individuals in the following areas:


Software Engineer

BiblioLabs is currently seeking a software developer with experience in the following areas:

  • Java and related ecosystem (2+ years)
  • XML and related technologies
  • Relational and document-oriented databases
  • Linux
  • Experience with the following is a plus: Mule (ESB), subversion, Hadoop (HDFS/MapReduce)

The position requires strong communication skills and the ability to work in a highly dynamic, iterative development environment.  The development process used requires the ability to write quality unit and integration tests as well as documentation.

A bachelor’s degree in computer science (or related field) is generally required. Compensation is commensurate with experience. BiblioLabs is located in the beautiful and vibrant downtown Charleston, South Carolina community. To apply, please send your resume and salary requirements to jobs@bibliolabs.com.


Senior Imaging Software Engineer

BiblioLabs is currently seeking a senior-level imaging software engineer to help drive the development of the Company’s Global Book Manager (GBM) software application. Specifically, this position will focus on both new content acquisition and improving the processing of existing content. The position requires a minimum of 5 years of experience in the following areas:

  • C++
  • Strong experience working with digital images (TIFF, JPEG, PNG etc…)
  • Knowledge of digital image processing (tonal curves, gamma correction, color management, masking, interpolation, descreening, deskewing, despeckling, etc…)
  • Strong working knowledge of PDF processing, as well as ebook format standards (EPUB, Kindle AZW, etc…)
  • Experience developing in Linux environment
  • Experience with Optical Character Recognition (OCR) a plus

A bachelor’s or master’s degree in computer science (or related field) is generally required. Compensation is commensurate with experience. To apply, please send your resume and salary requirements to jobs@bibliolabs.com.


Infrastructure Engineer / System Administrator

BiblioLabs is currently seeking an infrastructure engineer to help manage company growth as it scales its software platform. This job requires knowledge of the following:

  • Linux (Ubuntu and CentOS)
  • Networking Technologies (DNS, DHCP, VPN, Firewalls, Routing etc…)
  • User Account Management (SSH, Mail)
  • Storage (RAID, Hadoop distributed storage) (200+ TB)
  • Backup processes, systems and software (currently use Bacula)
  • Virtualization (VMWare)
  • Job scheduling (Condor)
  • Apache, Tomcat (load balancing / clustering)
  • Monitoring (currently use Nagios and Centreon)
  • Procedures to efficiently manage large numbers of machines (cloning, configuring, updating, etc…) (250+ nodes)

A bachelor’s degree in computer science (or related field) is generally required. Compensation is commensurate with experience. To apply, please send your resume and salary requirements to jobs@bibliolabs.com.

Home

Long Tail Graph

Preserving knowledge

BiblioLabs LLC is a hybrid software-media company, with a focus on using technology to give new life to historic books and other media. While there are tens of millions of books that are currently out of print and/or out of copyright, the high cost of manually recreating these works has resulted in only the most promising works getting resources for digitization and marketing. And how exactly does one determine which books have the most “promising” potential? The Company took a decidedly different approach.

Selling Books on The Long Tail
When the Internet enabled the discovery of books in an unconstrained manner, a new market was born called the Long Tail. The Long Tail represents products that have a long shelf-life, but that typically sell in small quantities.

The BiblioLabs team decided to build a system that greatly reduces the cost of publishing any book, and thereby takes the risk out of the selection process. Moreover, it is a system that actually shares both the risk and the reward with the libraries and other institutions that have worked so long and hard to preserve this content.

Global Book Manager (GBM)
The Company developed a scalable enterprise software application suite called Global Book Manager (GBM) that provides an end-to-end solution for the creation, marketing, enrichment and distribution of historic content. While the Company’s BiblioLife digitization platform is decentralized (located anywhere in the world, thereby keeping costs down and protecting the historic content), the GBM platform is centralized inside a secure, scalable and proprietary data center, where large banks of computers can work on the content creation process 24 hours a day. The benefits to partners and customers include:

  • The highest possible quality product for consumers, with rigorous quality assurance standards
  • Availability of content in a wide variety of trim sizes (regular and large size), binding types (paperback and hardcover), formats (print and e-books) and languages
  • Broad content distribution, with availability in over 4,000 retailers, web sites and distributors
  • A wide variety of attractive book cover templates, including custom cover designs for unique collections
  • The ability for strategic retail partners to customize and/or co-brand the product
  • A content enhancement platform (BiblioLife) that allows institutions and web communities to earn revenue by adding valuable information about the product, thereby improving searchability and sales

About

What a long strange trip it’s been

The management team of BiblioLabs didn’t exactly just wake up one day with an idea. It was more of an evolution.

Although the founders all had very different backgrounds and skill sets, they were originally brought together by a common desire to see print-on-demand (POD) publishing become mainstream. The POD company they founded was aptly named BookSurge. Notably, many of the challenges with POD publishing were similar in nature to the ones faced by BiblioLabs in that they required an automated technology solution to help bring operational efficiencies to an age-old industry. After struggling for many years to achieve market acceptance by publishers, distributors and retailers, the dream finally reached a major milestone when BookSurge was acquired by Amazon.com in 2005 to broaden its catalog of offerings. POD and web retailing go hand in hand because neither has limited shelf space, yet by the same token, neither wants to carry inventory.

Today POD has definitely become mainstream. Existing POD companies are expanding into international markets to bring the just-in-time manufacturing closer to the customer, while new POD competitors are emerging. No longer is it considered an inferior product relegated to self-published authors and second rate publishers.

Publishing out-of-print books was not new to the management team either. While working at Amazon, company principals were involved in putting up over 100,000 titles for sale, working with libraries such as University of Michigan and Cornell to launch some of the first out-of-print library catalogs to a mass internet audience.

While the mainstream acceptance of mediums such as print-on-demand books and e-books made the opportunity ripe to broaden the out-of-copyright collection, the age-old problem was still there. Namely, (a) how to identify the right books to create and market; and (b) how to do it in a manner that was cost-effective enough to offset the risk, given the unknown sales velocity…

So they set out to change the business model again.