Import HTML (Multi File)

If you have HTML content that is spread across multiple files, you may be able to import it by using Paligo's Confluence import feature. The success of the import depends on the structure of the HTML.

Note

Select the Confluence (index.html) option. Do not select the HTML import option as that is for single-file HTML imports only.

Important

Paligo can only import valid HTML. We recommend that you test your HTML by using a third-party HTML validator tool such as https://validator.w3.org/.

If the HTML is invalid, the import fails: Paligo recognises that the content is invalid and reports "nothing to import".

To import multiple HTML files at once, you need to organize your HTML content in a particular way. You also need to create or edit an "index.html" file so that it contains an unordered list that can act as a table of contents, with links to the various HTML files you are importing.

Note

Check that the source files are valid by using a validator such as HTML Tidy or the W3C validator.

HTML is not a structured format. The import will handle many flavors of HTML, but because of the many proprietary variants there is no guarantee it will work for yours. If you have problems, contact customer support to ask whether the content can be adjusted for import.

The HTML import option only imports one file at a time. To import multiple HTML files, use the Confluence import option.

If your content is not an actual Confluence export, you need to manually edit the index.html file so that it includes a <ul class="toc"> element containing the publication structure to be imported.

  1. Prepare your HTML content files. Make sure they have valid HTML structure, for example:

    <!DOCTYPE html>
    <html>
    <head>
        <title>Sample HTML Page</title>
    </head>
    <body>
        <h1>This is a Heading Level 1</h1>
        <p>This is some introductory text for our sample HTML page.</p>
        
        <h2>Heading Level 2</h2>
        <p>More text can go here to explain things in detail.</p>
        
        <h3>Heading Level 3</h3>
        <ul>
            <li>Item 1</li>
            <li>Item 2</li>
            <li>Item 3</li>
        </ul>
        
        <h3>Heading Level 3</h3>
        <ol>
            <li>First item</li>
            <li>Second item</li>
            <li>Third item</li>
        </ol>
        
        <h2>Another Heading Level 2</h2>
        <p>More text goes here. Below is an image:</p>
        <img src="https://example.com/sample-image.jpg" alt="Sample Image">
        
        <h2>Table Example</h2>
        <table border="1">
            <tr>
                <th>Header 1</th>
                <th>Header 2</th>
                <th>Header 3</th>
            </tr>
            <tr>
                <td>Row 1, Cell 1</td>
                <td>Row 1, Cell 2</td>
                <td>Row 1, Cell 3</td>
            </tr>
            <tr>
                <td>Row 2, Cell 1</td>
                <td>Row 2, Cell 2</td>
                <td>Row 2, Cell 3</td>
            </tr>
        </table>
    </body>
    </html>

    Tip

    Remember that the <title> tag in HTML is not used as the main heading. It only sets the title of the document. Use <h1> for the main heading, and then lower-level <h2> and <h3> tags for subheadings.

  2. Organize your HTML content like this:

    • Parent "container" folder

      • index.html

      • Images folder

        • Image files

      • CSS folder

        • CSS files

      • Content folder

        • HTML files

    It is important that the "container" folder contains one index.html file at the root level, with all other HTML pages inside a subfolder.

    For example:

    Folder structure shown in macOS. There is a parent folder at the top level. Inside that, there is an index.html file, an images folder, a CSS folder, and a Content folder. The folders are expanded to show that they contain images, a CSS file, and HTML files for the content.

    Note

    When importing HTML, the folder names are case-sensitive.

    Note

    Make sure your Content folder only contains HTML files. It should not contain other types of files, such as .doc, .js, or .css files.

  3. Now you need to edit the index.html file. Use a text editor or code editor to open the file and add this structure to it:

    <!DOCTYPE html>
    <html>
    <head>
        <title>Import</title>
    </head>
    <body>
        <ul class="toc">
            <li><a href="enter relative link path here">Link text</a></li>
            <li><a href="enter relative link path here">Link text</a></li>
            <li><a href="enter relative link path here">Link text</a></li>
            <li><a href="enter relative link path here">Link text</a></li>
            <li><a href="enter relative link path here">Link text</a></li>
        </ul>
    </body>
    </html>

    Where:

    • doctype, html, head, title, and body form the basic structure used in all HTML files.

    • ul defines the start of an unordered list (bullet list). It must have the class name "toc". This class is essential for the import process, and the import will not work correctly without it.

    • li defines a list item.

    • a href defines a link. The index.html file needs a list item and link for every HTML page that you want the import to bring into Paligo. Each link must be a relative link to an HTML file that is stored in the Content folder inside the "container" folder.

      To learn more about relative links, see w3schools.com/html_filepaths.

    Note

    When adding a link, be aware that the folder names are case-sensitive.
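
    As an illustration, a completed index.html might look like the following minimal sketch. The folder name Content and the file names (introduction.html, installation.html, troubleshooting.html) are hypothetical placeholders, not names that Paligo requires. Replace them with the names used in your own "container" folder, matching the case exactly. The HTML comment is only there to flag the placeholders and can be removed.

    ```html
    <!DOCTYPE html>
    <html>
    <head>
        <title>Import</title>
    </head>
    <body>
        <!-- Hypothetical example: the folder and file names below are placeholders. -->
        <ul class="toc">
            <li><a href="Content/introduction.html">Introduction</a></li>
            <li><a href="Content/installation.html">Installation</a></li>
            <li><a href="Content/troubleshooting.html">Troubleshooting</a></li>
        </ul>
    </body>
    </html>
    ```

    Each href is written relative to the location of index.html, so a path such as Content/introduction.html points to a file inside the Content subfolder of the "container" folder.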

  4. When you have added the links, save the index.html file.

    Tip

    We recommend that you use an HTML validator to test your HTML file, such as https://www.html-tidy.org. This will help to identify and potentially fix any structural problems in your HTML before you attempt to import it.

  5. Next, use your computer's operating system or a third-party application to make a zip file of the "container" folder.

    Note

    To learn how to create a zip file, see your operating system's documentation.

  6. Use the Import Wizard to import the zip file. Select the Confluence (index.html) option, which allows you to import multiple files.

If your HTML import does not work, we recommend that you:

  • Validate your HTML content. Paligo will only import correctly formed HTML content.

    There are many HTML validation tools you can use, such as https://www.html-tidy.org.

  • Make sure that you have organized the content as described. The arrangement of the files and folders in the zip file that you import is vital to the success of the import.

  • Check that your HTML files use relative references to images and CSS.
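
    As a sketch of what relative references can look like, the following hypothetical content file is assumed to be stored in the Content folder, with a stylesheet in a sibling folder named CSS and an image in a sibling folder named Images. The folder and file names (CSS, Images, styles.css, diagram.png) are assumptions for illustration only. Use the names from your own folder structure, matching the case exactly.

    ```html
    <!DOCTYPE html>
    <html>
    <head>
        <title>Installation</title>
        <!-- Hypothetical path: ../ goes up from the Content folder to the "container" folder. -->
        <link rel="stylesheet" href="../CSS/styles.css">
    </head>
    <body>
        <h1>Installation</h1>
        <p>The diagram below shows the installation steps.</p>
        <!-- Hypothetical path: the image is stored in the Images folder next to Content. -->
        <img src="../Images/diagram.png" alt="Installation diagram">
    </body>
    </html>
    ```

    By contrast, a reference such as /Users/name/site/Images/diagram.png is an absolute path that depends on the file system of the original computer, so it does not point to a file inside the zip file.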

Note

If you fix these issues and the import still fails to work, contact customer support for assistance.