How Digital Publishers Benefit from HTML
David Green
The digital revolution has changed the way scientists carry out their research and process and store their results. Hypertext Markup Language (HTML) and Portable Document Format (PDF) both emerged in the early 1990s and became alternatives to paper articles. Both have their place in publishing; however, HTML has increasingly become the standard for online use as it is more in tune with developments in the research process.
PDF is a wonderful format for print publishing and remains the preferred format for archiving and offline use. It has saved countless hours and dollars in publication management, from design to the printed page, and it has its own place as a design-to-press tool. However, the web is about searching, linking, chunking, and, increasingly in a mobile world, responsiveness. Users prefer the sharable and interactive aspects of HTML.
What Is HTML?
The elements of a web page, such as headings, text, images, and links, are described with HTML, while appearance and page layout are usually handled by companion style sheets (CSS). Although other web technologies and tools exist, such as content management systems, HTML remains the standard markup language for creating web pages.
Advantages of Using HTML in Publishing
HTML has many advantages over other publishing options currently available:
- It is search engine, browser, and mobile friendly.
- It was specifically designed for screen viewing, giving it a more pleasing on-screen appearance than the more printer-focused PDF.
- It supports rich, interactive content.
- It is easy to share by link (even when a file is large).
- Its content can be kept up to date and linked to related material.
- It can link to data repositories.
- It can include supplementary material.
- It usually has a smaller file size than the equivalent PDF, so content reaches the reader faster.
- It loads progressively, so readers can start on the content as it arrives rather than waiting until everything is loaded.
- It allows for selectable text, and even when text is rendered as an image, alternate text can be provided for screen readers.
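The last point, alternate text for images, is one a publisher can check automatically. What follows is a minimal Python sketch, using only the standard library, that lists images lacking an alt attribute. The sample markup, the file names, and the helper name images_missing_alt are illustrative assumptions, not part of any particular publishing platform.

```python
from html.parser import HTMLParser

# Hypothetical article fragment: the first image carries alternate text,
# the second does not.
SAMPLE_HTML = """
<p>Results are summarized below.</p>
<img src="figure1.png" alt="Bar chart of yearly submissions">
<img src="figure2.png">
"""


class AltTextAuditor(HTMLParser):
    """Records the src of every <img> that has no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An empty alt="" is left alone: it is the usual way to mark a
        # purely decorative image, so only a missing attribute is flagged.
        if "alt" not in attributes:
            self.missing.append(attributes.get("src") or "(no src)")


def images_missing_alt(html):
    """Return the src values of images that screen readers cannot describe."""
    auditor = AltTextAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.missing


print(images_missing_alt(SAMPLE_HTML))  # ['figure2.png']
```

A check like this could be run over every article before publication. A PDF offers no comparably simple way to inspect its content.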
Search Engines Favor HTML
If your HTML code is clean and validated, an HTML-based article is the easiest format for search-engine crawlers to access and read.
First, HTML markup helps search-engine crawlers, such as Googlebot, find items such as images, videos, scripts, and style sheets and index your content. A semantically coded article can improve how completely and accurately your content is indexed. (Semantic coding describes the content [e.g., a first-level heading] rather than the appearance [e.g., boldface].)
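To make the difference between semantic and presentational coding concrete, here is a rough Python sketch of how a program can recover an article's outline from heading tags. It is not how Googlebot or any real crawler works; the two sample fragments and the helper name extract_outline are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Two hypothetical fragments that could look alike on screen.
# The first describes the content; the second only describes its appearance.
SEMANTIC = (
    "<article><h1>Article Title</h1><p>Intro text.</p>"
    "<h2>Methods</h2></article>"
)
PRESENTATIONAL = (
    "<div><b>Article Title</b><br>Intro text.<br><b>Methods</b></div>"
)


class OutlineParser(HTMLParser):
    """Collects a (level, text) pair for every h1 to h6 element."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # heading level currently open, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._buffer).strip()))
            self._level = None


def extract_outline(html):
    """Return the heading outline a machine can read from the markup."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


print(extract_outline(SEMANTIC))        # [(1, 'Article Title'), (2, 'Methods')]
print(extract_outline(PRESENTATIONAL))  # []
```

The semantic fragment yields a usable outline, while the boldface version yields nothing: its headings are visible to a human reader but invisible to software.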
Second, the meta tags in your HTML article give search engines information about your web page when they index it. Meta tags are short pieces of text in the page's markup that describe its content but are not displayed on the page itself.
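As an illustration of what an indexer can read from meta tags, the Python sketch below collects the name and content pairs from a page's head. The sample head section, its values, and the helper name extract_meta are hypothetical; which meta tags a given search engine actually uses is not something this example settles.

```python
from html.parser import HTMLParser

# Hypothetical <head> section of an HTML article.
SAMPLE_HEAD = """
<head>
  <meta charset="utf-8">
  <title>Article Title</title>
  <meta name="description" content="One-sentence summary of the article.">
  <meta name="author" content="A. Author">
</head>
"""


class MetaParser(HTMLParser):
    """Collects <meta name="..." content="..."> pairs into a dictionary."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Tags such as <meta charset="utf-8"> have no name and are skipped.
        if name and content is not None:
            self.meta[name.lower()] = content


def extract_meta(html):
    """Return the named meta tags found in the markup."""
    parser = MetaParser()
    parser.feed(html)
    parser.close()
    return parser.meta


print(extract_meta(SAMPLE_HEAD))
```

For the sample above, this prints a dictionary holding the description and author entries, which is the kind of summary information a search engine can pick up without reading the article body.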
Finally, HTML5 introduced markup tags for content such as navigation menus, audio, and video, which makes that content easier for search engines to identify and index. Clean, well-structured markup can also reduce crawling time and improve page load time, both of which can boost your website in the search engine results pages (SERPs).
PDFs can be indexed by search engines, but they lack the tag structure that helps search engines rank content for target keywords. PDFs also are not effective for image search engine optimization (SEO).
Browsers Support HTML
Every mainstream web browser supports HTML. So, when you build a website using HTML, it displays in browsers worldwide, as long as the developer tests and optimizes the website for the most commonly used ones. Optimizing HTML-based content for browser compatibility is straightforward.
PDFs, by contrast, depend on a built-in viewer or a plugin, and how well they display varies from one browser to the next.
Mobile Optimization with HTML
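HTML is mobile friendly: a responsive page can adapt its layout to the screen it is viewed on, which is important since there are many types of mobile devices. In contrast, a PDF has a fixed page layout, so it's nearly impossible to provide a responsive design for a PDF opened on a mobile device.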
Flexibility, Usability, Customization, and Development of HTML
You want dynamic content that allows the user to interact with it instead of just looking at it. HTML, particularly HTML5, can transform how users interact with your content. It gives you the power to create content that can be accessed anywhere, whenever it’s most convenient for the reader. It even supports offline storage, so your readers can access it at a later time without connecting to the Internet.
HTML content also brings people back to your site: readers can bookmark the URL for later review and share the link on social media, which means your material can be freely promoted and shared by the public (earned media).
Furthermore, HTML content can be rendered quickly across devices. (Rendering occurs when the HTML coding is turned into what the user sees [e.g., “<b>” is set as boldface type].) If the page is set up properly, users can still print the content to PDF if they wish to store it.
Another major advantage of HTML is that it is free to use. Unlike with some open-source content management systems, which may rely on paid add-ons, you do not need to buy software or plugins, so you can save considerably on your website-development costs.
Almost everyone in web development – whether a freelancer or a large agency – knows HTML. It’s not hard to find providers who can cost-effectively update your content.
HTML can also be customized easily. There are more web-development tools (e.g., FrontPage, Dreamweaver) for creating HTML-based publishing content than for any other web technology. HTML is relatively inexpensive to produce, and in many cases the cost of distribution (such as website dissemination, sharing, offline storage, and views) is zero.
PDF-to-HTML Plugin for OJS 3
At OpenJournalSystems.com, our OJS PDF-to-HTML Plugin generates HTML content from PDF files. These can then be uploaded to OJS as HTML galley files that maintain the fonts, images, tables, hyperlinks, and tables of contents. We’re sure this dynamic new plugin will be a welcome tool in your online publishing toolbox. Please contact us to arrange for a demo.