2.6 Configuring HTML Rewriting

Access Gateway configurations generally require HTML rewriting because the Web servers are not aware that the Access Gateway machine is obfuscating their DNS names. URLs contained in their pages must be checked to ensure that these references contain the DNS names that the client browser understands. On the other end, the client browsers are not aware that the Access Gateway is obfuscating the DNS names of the resources they are accessing.

The URL requests coming from the client browsers that use published DNS names must be rewritten to the DNS names that the Web servers expect. Figure 2-6 illustrates these processes.

Figure 2-6 HTML Rewriting

The following sections describe the HTML rewriting process:

2.6.1 Understanding the Rewriting Process

The Access Gateway needs to rewrite URL references under the following conditions:

  • To ensure that URL references contain the proper scheme (HTTP or HTTPS).

    If your Web servers and Access Gateway machines are behind a secure firewall, you might not require SSL sessions between them, and only require SSL between the client browser and the Access Gateway. For example, an HTML file being accessed through the Access Gateway for the Web site novell.com might have a URL reference to http://novell.com/path/image1.jpg. If the reverse proxy for novell.com/path is using SSL sessions between the browser and Access Gateway, the URL reference http://novell.com/path/image1.jpg must be rewritten to https://novell.com/path/image1.jpg. Otherwise, when the user clicks the HTTP link, the browser must change from HTTP to HTTPS and establish a new SSL session.

  • To ensure that URL references containing private IP addresses or private DNS names are changed to the published DNS name of the Access Gateway or hosts.

    For example, suppose that a company has an internal Web site named data.com, and wants to expose this site to Internet users through the Access Gateway by using a published DNS name of novell.com. Many of the HTML pages on this Web site have URL references that contain the private DNS name, such as http://data.com/image1.jpg. Because Internet users are unable to resolve data.com/image1.jpg, links using this URL reference would return DNS errors in the browser.

    The HTML rewriter can resolve this issue. The DNS name field in the Access Gateway configuration is set to novell.com, which users can resolve through a public DNS server to the Access Gateway. The rewriter parses the Web page, and any URL references matching the private DNS name or private IP address listed in the Web server address field of the Access Gateway configuration are rewritten to the published DNS name novell.com and the port number of the Access Gateway.

    Rewriting URL references addresses two issues. First, URL references that were unreachable because they used private DNS names or IP addresses become accessible. Second, private IP addresses and DNS names that might be sensitive information are not exposed.

  • To ensure that the Host header in incoming HTTP packets contains the name understood by the internal Web server.

    Using the example in Figure 2-6, suppose that the internal Web server expects all HTTP or HTTPS requests to have the Host field set to data.com. When users send requests using the published DNS name novell.com/path, the Host field of the packets in those requests received by the Access Gateway is set to novell.com. The Access Gateway can be configured to rewrite this public name to the private name expected by the Web server by setting the Web Server Host Name option to data.com. Before the Access Gateway forwards packets to the Web server, the Host field is changed (rewritten) from novell.com to data.com. For information about configuring this option, see Section 2.3, Configuring Web Servers of a Proxy Service.
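The three conditions above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Access Gateway implementation: the function names and the regular expression are hypothetical, the host names are taken from the example, and port handling is left out.

```python
import re

# Values from the example in this section.
PRIVATE_NAMES = {"data.com"}        # names found in the Web server address list
PUBLISHED_NAME = "novell.com"       # published DNS name of the proxy service
WEB_SERVER_HOST_NAME = "data.com"   # value of the Web Server Host Name option

# Hypothetical pattern: finds the scheme and host of absolute references.
URL_RE = re.compile(r"https?://([A-Za-z0-9.-]+)", re.IGNORECASE)


def rewrite_response_body(html, published_scheme="https"):
    """Outbound: swap private names for the published name and fix the scheme."""
    def swap(match):
        host = match.group(1).lower()
        if host in PRIVATE_NAMES or host == PUBLISHED_NAME:
            return f"{published_scheme}://{PUBLISHED_NAME}"
        return match.group(0)  # unknown hosts are left untouched
    return URL_RE.sub(swap, html)


def rewrite_request_host(headers):
    """Inbound: replace the published name in the Host header with the private name."""
    rewritten = dict(headers)
    if rewritten.get("Host", "").lower() == PUBLISHED_NAME:
        rewritten["Host"] = WEB_SERVER_HOST_NAME
    return rewritten
```

For instance, `rewrite_response_body('<img src="http://data.com/image1.jpg">')` yields a reference to `https://novell.com/image1.jpg`, and `rewrite_request_host({"Host": "novell.com"})` yields a Host value of `data.com`.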

The rewriter searches for URLs in the following HTML contexts. They must meet the following criteria to be rewritten:

Context

Criteria

HTTP Headers

Qualified URL references occurring within certain types of HTTP response headers such as Location and Content-Location are rewritten. The Location header is used to redirect the browser to where the resource can be found. The Content-Location header is used to provide an alternate location where the resource can be found.

JavaScript

Within JavaScript, absolute references are always evaluated for rewriting. Relative references (such as index.html) are not attempted. Absolute paths (such as /docs/file.html) are evaluated if the page is read from a path-based multi-homing Web server and the reference follows an HTML tag. For example, the string href='/docs/file.html' is rewritten if /docs is a multi-homing path that has been configured to be removed.

HTML Tags

URL references occurring within the following HTML tag attributes are evaluated for rewriting:

action                  archive             background
cite                    code                codebase
data                    dynsrc              filterLink
href                    longdesc            lowsrc
o:WebQuerySourceHref    onclick             onmenuclick
pluginspage             src                 usemap
usermapborderimage

References

An absolute reference is a reference that has all the information needed to locate a resource, including the hostname, such as http://internal.web.site.com/index.html. The rewriter always attempts to rewrite absolute references.

The rewriter attempts to rewrite an absolute path when it is the multi-homing path of a path-based multi-homing service. For example, /docs/file1.html is rewritten if /docs is a multi-homing path that has been configured to be removed.

Relative references are not rewritten.

Query Strings

URL references contained within query strings can be configured for rewriting by enabling the Rewrite Inbound Query String Data option.

Post Data

URL references specified in Post Data can be configured for rewriting by enabling the Rewrite Inbound Post Data option.
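The distinction between absolute references, absolute paths, and relative references can be illustrated with a short Python sketch. The function names are hypothetical, and the check assumes that a single multi-homing path configured to be removed is passed in by the caller; scheme-less references such as //host/file are not handled.

```python
from urllib.parse import urlsplit


def classify_reference(ref):
    """Return the kind of reference, using the terms from the table above."""
    parts = urlsplit(ref)
    if parts.scheme and parts.netloc:
        return "absolute reference"   # has scheme and hostname
    if ref.startswith("/"):
        return "absolute path"        # starts at the server root
    return "relative reference"


def is_rewrite_candidate(ref, removed_multi_homing_path=None):
    """Absolute references are always evaluated; absolute paths only when they
    fall under a multi-homing path that is configured to be removed."""
    kind = classify_reference(ref)
    if kind == "absolute reference":
        return True
    if kind == "absolute path" and removed_multi_homing_path:
        path = removed_multi_homing_path.rstrip("/")
        return ref == path or ref.startswith(path + "/")
    return False  # relative references are not rewritten
```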

2.6.2 Specifying DNS Names to Rewrite

The rewriter parses and searches the Web content that passes through the Access Gateway for URL references that qualify to be rewritten. URL references are rewritten when they meet the following conditions:

  • URL references containing DNS names or IP addresses matching those in the Web server address list are rewritten with the Published DNS Name.

  • URL references matching the Web Server Host Name are rewritten with the Published DNS Name.

  • URL references matching entries in the Additional DNS Name List of the host are rewritten with the Published DNS Name. The Web Server Host Name does not need to be included in this list.

  • The DNS names in the Exclude DNS Name List specify the names that the rewriter should skip and not rewrite.

NOTE: Excludes in the Exclude DNS Name List are processed first, then the includes in the Additional DNS Name List. If you put the same DNS name in both lists, the DNS name is rewritten.
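The precedence described in the note can be modeled as follows. This is a hedged sketch with a hypothetical function name and parameters. It assumes that an excluded name is skipped even if it also appears in the Web server address list or as the Web Server Host Name; the text above only states the outcome for a name that is in both the Exclude and Additional lists.

```python
def should_rewrite(host, web_server_addresses, web_server_host_name,
                   additional_names, exclude_names):
    """Decide whether a host found in a URL reference is rewritten."""
    name = host.lower()
    included = name in {n.lower() for n in additional_names}
    excluded = name in {n.lower() for n in exclude_names}

    if included:
        # Includes are processed after excludes, so a name in both lists is rewritten.
        return True
    if excluded:
        # Assumption: an exclude also overrides the default names.
        return False
    defaults = {a.lower() for a in web_server_addresses}
    defaults.add(web_server_host_name.lower())
    return name in defaults
```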

The following sections describe the conditions to consider when adding DNS names to the lists:

Determining Whether You Need to Specify Additional DNS Names

Sometimes Web pages contain URL references to a hostname that does not meet the default criteria for being rewritten. That is, the URL reference does not match the Web Server Host Name or any value (IP address) in the Web Server List. If these names are sent back to the client, they are not resolvable. Figure 2-7 illustrates a scenario that requires an entry in the Additional DNS Name List.

Figure 2-7 Rewriting URLs for Web Servers

The page on the data.com Web server contains two links, one to an image on the data.com server and one to an image on the graphics.com server. When rewriting is enabled, the link to the data.com server is automatically rewritten to novell.com. The link to the image on graphics.com is not rewritten until you add this DNS name to the Additional DNS Name List. When the link is rewritten, the browser knows how to request it, and the Access Gateway knows how to resolve it.

You need to include names in this list if your Web servers have the following configurations:

  • If you have a cluster of Web servers that are not sharing the same DNS name, you need to add their DNS names to this list.

  • If your Web server obtains content from another Web server, the DNS name for this additional Web server needs to be added to the list.

  • If the Web server listens on one port (for example, 80), and redirects the request to a secure port (for example, 443), the DNS name needs to be added to the list. The response to the user comes back on https://<DNS_name>:443. This does not match the request that was sent on http://<DNS_name>:80. If you add the DNS name to the list, the response can be sent in the format that the user expects.

  • If an application is written to use a private hostname, you need to add the private hostname to the list. For example, assume that an application URL reference contains the hostname of home (http://home/index.html). This hostname needs to be added to the Additional DNS Name List.

  • If you enable the Forward Received Host Name option on your path-based multi-homing service and your Web server is configured to use a different port, you need to add the DNS name with the port to the Additional DNS Name List.

    For example, if the public DNS name of the proxy service is www.myag.com, the path for the path-based multi-homing service is /sales, and the Web server port is 801, the following DNS name needs to be added to the Additional DNS Name List of the /sales service:

    http://www.myag.com:801
    

When you enter a name in the list, it can use any of the following formats:

DNS_name
host_name
IP_address
scheme://DNS_name
scheme://IP_address
scheme://DNS_name:port
scheme://IP_address:port

For example:

HOME
https://www.backend.com
https://10.10.15.206:444

These entries are not case sensitive.
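To compare list entries consistently, a tool that validates or preprocesses them could normalize each entry into a scheme, a name, and a port. The sketch below is only an illustration of the formats listed above; the function name and the tuple layout are hypothetical and are not part of the product.

```python
from urllib.parse import urlsplit


def normalize_entry(entry):
    """Split a list entry into (scheme, name, port); missing parts are None.

    Entries are lowercased because they are not case sensitive.
    """
    text = entry.strip().lower()
    if "://" in text:
        parts = urlsplit(text)
        return (parts.scheme, parts.hostname, parts.port)
    return (None, text, None)
```

With this helper, `HOME` and `home` normalize to the same tuple, and `https://10.10.15.206:444` is split into its scheme, address, and port.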

Determining Whether You Need to Exclude DNS Names from Being Rewritten

If you have two reverse proxies protecting the same Web server, the rewriter correctly rewrites the references to the Web server so that the browser always uses the same reverse proxy. In other words, if the browser requests a resource using novell.com.uk, the response is returned with references to novell.com.uk and not novell.com.usa.

If you have a third reverse proxy protecting a Web server, the rewriting rules can become ambiguous. For example, consider the configuration illustrated in Figure 2-8.

Figure 2-8 Excluding URLs

A user accesses data.com through the published DNS name of novell.com.mx. The data.com server has references to product.com. The novell.com.mx proxy has two ways to get to the product.com server because this Web server has two published DNS names (novell.com.uk and novell.com.usa). The rewriter could use either of these names to rewrite references to product.com.

  • If you want all users coming through novell.com.mx to use the novell.com.usa proxy, you need to block the rewriting of product.com to novell.com.uk. On the HTML Rewriting page of the reverse proxy for novell.com.uk, add product.com and any aliases to the Exclude DNS Name List.

  • If you do not care which proxy is returned in the reference, you do not need to add anything to the Exclude DNS Name List.

2.6.3 Defining the Requirements for the Rewriter Profile

An HTML rewriter profile allows you to customize the rewriting process and specify the profile that is selected to rewrite content on a page. This section describes the following features of the rewriter profile:

Types of Rewriter Profiles

The Access Gateway has the following types of profiles:

Default Word Profile

The default Word profile, named default, is not specific to a reverse proxy or its proxy services.

If you enable HTML rewriting, but you do not define a custom Word profile for the proxy service, the default Word profile is used. This profile is preconfigured to rewrite the Web Server Host Name and any other names listed in the Additional DNS Name List. The preconfigured profile matches all URLs with the following content-types:

text/html

text/javascript

text/xml

application/javascript

text/css

application/x-javascript

When you modify the behavior of the default profile, remember its scope. If the default profile does not match your requirements, you should usually create your own custom Word profile or custom Character profile.

Custom Word Profile

A Word profile searches for matches on words. For example, “get” matches the word “get” and any word that begins with “get” such as “getaway” but it does not match the “get” in “together” or “beget.”

For information about how strings are replaced in a Word profile, see String Replacement Rules for Word Profiles and String Tokens.

You should create a custom Word profile when an application requires rewrites of paths in JavaScript. If the application needs strings replaced or new content-types, these can also be added to the custom profile. In a custom Word profile, you can also configure the match criteria so that the profile matches specific URLs. For more information, see Page Matching Criteria for Rewriter Profiles.

When you create a custom Word profile, you need to position it before the default profile in the list of profiles. Only one Word profile is applied per page. The first Word profile that matches the page is applied. Profiles lower in the list are ignored.

Custom Character Profile

A custom Character profile searches for matches on a specified set of characters. For example, “top” matches the word “top” and the “top” in “tabletop,” “stopwatch,” and “topic.” If you need to replace strings that require this type of search, you should create a custom Character profile.
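The difference between the two search boundaries can be approximated with Python regular expressions. This is an illustrative sketch only: the function names are hypothetical, and the word-boundary test is written for plain alphabetic search strings such as the examples in this section, not for paths or tokens.

```python
import re


def word_profile_match(search, text):
    """Word boundary: the search string must start a word."""
    return re.search(r"\b" + re.escape(search), text) is not None


def character_profile_match(search, text):
    """Character boundary: the search string may appear anywhere."""
    return search in text
```

Under this approximation, `word_profile_match("get", "getaway")` is true and `word_profile_match("get", "together")` is false, while `character_profile_match("top", "stopwatch")` is true.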

For information about how strings are replaced in a Character profile, see String Replacement Rules for Character Profiles.

In a custom Character profile, you can also configure the match criteria so that the profile matches specific URLs. For more information, see Page Matching Criteria for Rewriter Profiles.

After the rewriter finds and applies the Word profile that matches the page, it finds and applies one Character profile. The first Character profile that matches the page is applied. Character profiles lower in the list are ignored.
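The selection order can be pictured with a small model. The `Profile` class and `select_profiles` function below are hypothetical names used for illustration; the sketch assumes that only enabled profiles are considered and that the list is already in the order shown in the HTML Rewriter Profile List.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Profile:
    name: str
    kind: str                       # "word" or "character"
    matches: Callable[[str], bool]  # page-matching test for this profile
    enabled: bool = True


def select_profiles(profiles, page_url):
    """Return the first matching Word profile and the first matching Character profile."""
    def first(kind):
        for profile in profiles:
            if profile.enabled and profile.kind == kind and profile.matches(page_url):
                return profile
        return None  # no profile of this kind matches the page
    return first("word"), first("character")
```

Because only the first match of each kind is used, a specific custom Word profile must be listed before the default profile to take effect.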

Page Matching Criteria for Rewriter Profiles

You specify the following matching criteria for selecting the profile:

  • The URLs to match

  • The URLs that cannot match

  • The content types to match

You use the Requested URLs to Search section of the profile to set up the matching policy. The first Word profile and the first Character profile that match the page are applied. Profiles lower in the list are ignored.

URLs: The URLs specified in the policy should use the following formats:

Sample URL

Description

http://www.a.com/content

Matches pages only if the requested URL does not contain a trailing slash.

http://www.a.com/content/

Matches pages only if the requested URL does contain a trailing slash.

http://www.a.com/content/index.html

Matches only this specific file.

http://www.a.com/content/*

Matches the requested URL whether or not it has a trailing slash and matches all files in the directory.

http://www.a.com/*

Matches the proxy service and everything it is protecting.

You can specify two types of URLs. In the If Requested URL Is list, you specify the URLs of the pages you want this profile to match. In the And Requested URL Is Not list, you specify the URLs you don’t want this profile to match. If an asterisk wildcard in the If Requested URL Is list matches more pages than you want, add URLs to the And Requested URL Is Not list to exclude the unwanted pages. If a page matches both a URL in the If Requested URL Is list and a URL in the And Requested URL Is Not list, the profile does not match the page.

For example, you could specify the following URL in the If Requested URL Is list:

http://www.a.com/*

You could then specify the following URL in the And Requested URL Is Not list:

http://www.a.com/content/*

These two entries cause the profile to match all pages on the www.a.com Web server except for the pages in the /content directory and its subdirectories.
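The URL matching rules in this section can be sketched as follows. The function names are hypothetical, and the sketch makes two assumptions that the text does not spell out: a trailing /* is the only supported wildcard position, and an empty If Requested URL Is list places no restriction on the page.

```python
def url_pattern_matches(pattern, url):
    """Apply the sample URL rules: exact match unless the pattern ends in /*."""
    if pattern.endswith("/*"):
        base = pattern[:-2]
        # Matches the directory with or without a trailing slash,
        # and everything below it.
        return url == base or url.startswith(base + "/")
    return url == pattern


def url_criteria_match(url, is_list, is_not_list):
    """Combine the If Requested URL Is and And Requested URL Is Not lists."""
    if is_list and not any(url_pattern_matches(p, url) for p in is_list):
        return False
    if any(url_pattern_matches(p, url) for p in is_not_list):
        return False  # a page that matches both lists does not match the profile
    return True
```

With `http://www.a.com/*` in the first list and `http://www.a.com/content/*` in the second, `http://www.a.com/index.html` matches and `http://www.a.com/content/a/b.html` does not.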

IMPORTANT: If nothing is specified in either of the two lists, the profile skips the URL matching requirements and uses the content-type to determine if a page matches.

Content-Type: In the And Document Content-Type Is section, you specify the content-types you want this profile to match. To add a new content-type, click New and specify the name, such as text/dns. Search your Web pages for content-types to determine if you need to add new types. To add multiple values, enter each value on a separate line.

Regardless of content-types you specify, the page matches the profile if the file extension is html, htm, shtml, jhtml, asp, or jsp and you have not specified any URL matching criteria.
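The content-type rule and the file extension exception can be expressed as a short check. This is a sketch under stated assumptions: the function name and parameters are hypothetical, and parameters after the media type (such as charset) are assumed to be ignored when comparing.

```python
import posixpath
from urllib.parse import urlsplit

# Extensions that match when no URL matching criteria are specified.
ALWAYS_MATCHED_EXTENSIONS = {"html", "htm", "shtml", "jhtml", "asp", "jsp"}


def content_criteria_match(url, content_type, profile_types, has_url_criteria):
    """Return True if the page satisfies the content-type part of the policy."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in {t.lower() for t in profile_types}:
        return True
    if not has_url_criteria:
        # Fallback: the file extension decides, regardless of content-type.
        extension = posixpath.splitext(urlsplit(url).path)[1].lstrip(".").lower()
        return extension in ALWAYS_MATCHED_EXTENSIONS
    return False
```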

Possible Actions for Rewriter Profiles

The rewriter action section of the profile determines the actions the rewriter performs when a page matches the profile. Select from the following:

Inbound Actions: A profile might require these options if the proxy service has the following characteristics:

  • URLs appear in query strings, Post Data, or headers.

  • The Web server uses WebDAV methods.

If your profile needs to match pages from this type of proxy service, you might need to enable the options listed below. They control the rewriting of query strings, Post Data, and headers from the Access Gateway to the Web server.

  • Rewrite Inbound Query String Data: Select this option to rewrite the domain and URL in the query string to match the Web server configuration or to remove the path from the query string on a path-based multi-homing proxy with the Remove Path on Fill option enabled.

  • Rewrite Inbound Post Data: Select this option to rewrite the domain and URL in the Post Data to match the Web server configuration or to remove the path from the Post Data on a path-based multi-homing proxy with the Remove Path on Fill option enabled.

  • Rewrite Inbound Headers: Select this option to rewrite the following headers:

    • Call-Back
    • Destination
    • If
    • Notification-Type
    • Referer

The inbound options are not available for a Character profile.

Enabling or Disabling Rewriting: The Enable Rewriter Actions option determines whether the rewriter performs any actions:

  • Select the option to have the rewriter rewrite the references and data on the page.

  • Leave the option deselected to disable rewriting. This allows you to create a profile for the pages you do not want rewritten.

Additional Names to Search for URL Strings to Rewrite with Host Name: Use this section to specify the name of the variable, attribute, or method in which the hostname might appear. These options are not available for a Character profile.

  • Variable and Attribute Name to Search for Is: Use this section to specify the HTML attributes or JavaScript variables that you want searched for DNS names that might need to be rewritten. For the list of HTML attribute names that are automatically searched, see HTML Tags. You might want to add the following attributes:

    • value: This attribute enables the rewriter to search the <param> elements on the HTML page for value attributes and rewrite the value attributes that are URL strings.

      If you need more granular control (some need to be rewritten but others do not) and you can modify the page, see Disabling with Page Modifications.

    • formvalue: This attribute enables the rewriter to search the <form> element on the HTML page for <input>, <button>, and <option> elements and rewrite the value attributes that are URL strings. For example, if your multi-homing path is /test and the form line is <input name="navUrl" type="hidden" value="/IDM/portal/cn/GuestContainerPage/656gwmail">, this line would be rewritten to the following value before sending the response to the client:

      <input name="navUrl" type="hidden" value="/test/IDM/portal/cn/GuestContainerPage/656gwmail">
      

      The formvalue attribute enables the rewriting of all URLs in the <input>, <button>, and <option> elements in the form. If you need more granular control (some need to be rewritten but others do not) and you can modify the form page, see Disabling with Page Modifications.

  • Replacing URLs in JavaScript Methods: The JavaScript Method to Search for Is list allows you to specify the JavaScript methods to search to see if their parameters contain a URL string.

String Replacement: The Additional Strings to Replace list allows you to search for a string and replace it. The search boundary (word or character) that you specified when creating the profile is used when searching for the string.

Word profile search and replace actions take precedence over character profile actions.

For the rules and tokens that can be used in the search strings, see String Replacement Rules for Word Profiles, String Tokens, and String Replacement Rules for Character Profiles.

For information about how the Additional Strings to Replace list can be used to reduce the number of JavaScript methods you need to list, see Using $path to Rewrite Paths in JavaScript Methods or Variables.

String Replacement Rules for Word Profiles

In a Word profile, a string matches all paths that start with the characters in the specified string. For example:

Search String      Matches This String      Doesn’t Match This String
/path              /path                    /mypath
                   /pathother
                   /path/other
                   /path.html

String Tokens

On the Access Gateway Service, you can use the following special tokens to modify the default matching rules. The Access Gateway Appliance does not support these tokens.

  • [w] to match one white space character

  • [ow] to match 0 or more white space characters

  • [ep] to match a path element in a URL path, excluding words that end in a period

  • [ew] to match a word element in a URL path, including words that end in a period

  • [oa] to match one or more alphanumeric characters

White Space Tokens: You use the [w] and the [ow] tokens to specify where white space might occur in the string. For example:

[ow]my[w]string[w]to[w]replace[ow]

If you don’t know, or don’t care, whether the string has zero or more white space characters at the beginning and at the end, use [ow] to specify this. The [w] token specifies exactly one white space character.

Path Tokens: You use the [ep] and [ew] tokens to match path strings. The [ep] token can be used to match the following types of paths:

Search String      Matches This String      Doesn’t Match This String
/path[ep]          /path                    /path.html
                   /home/path/other         /home/pathother

The [ew] token can be used to match the following types of paths:

Search String      Matches This String      Doesn’t Match This String
/path[ew]          /path.html               /paths
                   /home/path

Name Tokens: You use the [oa] token to match function or parameter names that have a set string to start the name and end the name, but the middle part of the name is a computer-generated alphanumeric string. For example, the [oa] token can be used to match the following types of names:

Search String           Matches This String         Doesn’t Match This String
javaFunction-[oa](      javaFunction-1234a56()      javaFunction()
                        javaFunction-a()
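One way to reason about the white space and name tokens is to translate a search string into a regular expression. The sketch below is not how the Access Gateway Service evaluates tokens; the function name and the mapping are assumptions, [w] is assumed to cover any single white space character, and the [ep] and [ew] path tokens are deliberately left out because their exact boundary rules are not modeled here.

```python
import re

# Assumed regular expression equivalents for three of the tokens.
TOKEN_REGEX = {
    "[w]": r"\s",            # exactly one white space character
    "[ow]": r"\s*",          # zero or more white space characters
    "[oa]": r"[A-Za-z0-9]+", # one or more alphanumeric characters
}


def compile_search_string(search):
    """Translate a search string with [w], [ow], and [oa] tokens into a regex."""
    # The capturing group keeps the tokens in the split result.
    pieces = re.split(r"(\[ow\]|\[w\]|\[oa\])", search)
    pattern = "".join(TOKEN_REGEX.get(piece, re.escape(piece)) for piece in pieces)
    return re.compile(pattern)
```

For example, `compile_search_string("javaFunction-[oa](")` finds a match in `javaFunction-1234a56()` but not in `javaFunction()`.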

String Replacement Rules for Character Profiles

When you configure multiple strings for replacement, the rewriter uses the following rules for determining how characters are replaced in strings:

  • String replacement is done as a single pass.

  • String replacement is not performed recursively. Suppose you have listed the following search and replacement strings:

    DOG     to be replaced with     CAT
    A       to be replaced with     O
    

    All occurrences of the string DOG are replaced with CAT, regardless of whether it is the word DOG or the word DOGMA. Only one replacement pass occurs. The rewritten CAT is not replaced with COT.

  • Because string replacement is done in one pass, the string that matches first takes precedence. Suppose you have listed the following search and replacement strings:

    ABC       to be replaced with     XYZ
    BCDEF     to be replaced with     PQRSTUVWXYZ
    

    If the original string is ABCDEFGH, the replaced string is XYZDEFGH.

  • If two specified search strings match the data portion, the search string of longer length is used for the replacement except for the case detailed above. Suppose you have listed the following search and replacement strings:

    ABC        to be replaced with     XYZ
    ABCDEF     to be replaced with     PQRSTUVWXYZ
    

    If the original string is ABCDEFGH, the replaced string is PQRSTUVWXYZGH.
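The three rules can be reproduced with a single left-to-right scan. The function below is a hypothetical model of the behavior described above, not the rewriter's own code: at each position it tries the longest search string first, emits the replacement, and continues after the matched text, so replaced text is never searched again.

```python
def replace_single_pass(text, rules):
    """Apply search/replace rules in one pass over the text.

    rules maps each search string to its replacement string.
    """
    # Longest search strings first, so the longer one wins at the same position.
    ordered = sorted(
        ((s, r) for s, r in rules.items() if s),  # ignore empty search strings
        key=lambda rule: len(rule[0]),
        reverse=True,
    )
    output = []
    position = 0
    while position < len(text):
        for search, replacement in ordered:
            if text.startswith(search, position):
                output.append(replacement)
                position += len(search)  # skip the matched text; no second pass
                break
        else:
            output.append(text[position])
            position += 1
    return "".join(output)
```

Running the examples from this section through the function gives the same results: ABCDEFGH becomes XYZDEFGH with the ABC and BCDEF rules, and PQRSTUVWXYZGH with the ABC and ABCDEF rules.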

Using $path to Rewrite Paths in JavaScript Methods or Variables

You can use the $path token to rewrite paths on a path-based multi-homing service that has the Remove Path on Fill option enabled. This token is useful for Web applications that require a dedicated Web server and are therefore installed in the root directory of the Web server. If you protect this type of application with Access Manager using a path-based multi-homing service, your clients access the application with a URL that contains a /path value. The proxy service uses the path to determine which Web server a request is sent to, and the path must be removed from the URL before sending the request to the Web server.

The application responds to the requests. If it uses JavaScript methods or variables to generate paths to resources, these paths are sent to the client without prepending the path for the proxy service. When the client tries to access the resource specified by the Web server path, the proxy service cannot locate the resource because the multi-homing path is missing. The figure below illustrates this flow with the rewriter adding the multi-homing path in the reply.

Figure 2-9 Rewriting with a Multi-homing Path

To make sure all the paths generated by JavaScript are rewritten, you must search the Web pages of the application. You can then either list all the JavaScript methods and variables in the Additional Names to Search for URL Strings to Rewrite with Host Name section of the rewriter profile, or you can use the $path token in the Additional Strings to Replace section. The $path token reduces the number of JavaScript methods and variables that you otherwise need to list individually.

To use the $path token, you add a search string and a replace string that uses the token. For example, if the /prices/pricelist.html page is generated by JavaScript and the multi-homing path for the proxy service is /inner, you would specify the following strings:

Search String      Replacement String
/prices            $path/prices

This configuration allows the following paths to be rewritten before the Access Gateway sends the information to the browser.

Web Server String           Rewritten String for the Browser
/prices/pricelist.html      /inner/prices/pricelist.html
/prices                     /inner/prices

This token can cause strings that shouldn’t be changed to be rewritten. If you enable the Rewrite Inbound Query String Data, Rewrite Inbound Post Data, and Rewrite Inbound Headers actions, the rewriter checks these strings and ensures that they contain the information the Web server expects. For example, when these options are enabled, the following paths and domain names are rewritten when found in query strings, in Post Data, or in the Call-Back, Destination, If, Notification-Type, or Referer headers.

Browser String                  Rewritten String for the Web Server
/inner/prices/pricelist.html    /prices/pricelist.html
/inner/prices                   /prices
novell.com/inner/prices         inner.com/prices
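Both directions of the $path example can be modeled in a few lines. The function names are hypothetical and the sketch is deliberately naive: the outbound step is a plain substring replacement that ignores word boundaries, so it would also change a path that already contains /inner/prices, and the inbound step covers only the path, not the domain name.

```python
def expand_path_token(replacement, multi_homing_path):
    """Substitute the multi-homing path for $path in a replacement string."""
    return replacement.replace("$path", multi_homing_path)


def rewrite_outbound(text, search, replacement, multi_homing_path):
    """Web server to browser: add the multi-homing path to generated paths."""
    return text.replace(search, expand_path_token(replacement, multi_homing_path))


def rewrite_inbound(path, multi_homing_path):
    """Browser to Web server: remove the multi-homing path again."""
    if path == multi_homing_path or path.startswith(multi_homing_path + "/"):
        return path[len(multi_homing_path):] or "/"
    return path
```

With a multi-homing path of /inner, `rewrite_outbound("/prices/pricelist.html", "/prices", "$path/prices", "/inner")` returns `/inner/prices/pricelist.html`, and `rewrite_inbound` applied to that result returns the original Web server path.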

2.6.4 Configuring the HTML Rewriter and Profile

You configure the HTML rewriter for a proxy service, and these values are applied to all Web servers that are protected by this proxy service.

To configure the HTML rewriter:

  1. In the Administration Console, click Devices > Access Gateways > Edit > [Name of Reverse Proxy] > [Name of Proxy Service] > HTML Rewriting.

    The HTML Rewriting page specifies which DNS names are to be rewritten. The HTML Rewriter Profile specifies which pages to search for DNS names that need to be rewritten.

  2. Select Enable HTML Rewriting.

    This option is enabled by default. When it is disabled, no rewriting occurs. When enabled, this option activates the internal HTML rewriter. This rewriter replaces the name of the Web server with the published DNS name when sending data to the browsers. It replaces the published DNS name with the Web Server Host Name when sending data to the Web server. It also makes sure the proper scheme (HTTP or HTTPS) is included in the URL. This is needed because you can configure the Access Gateway to use HTTPS between itself and client browsers and to use HTTP between itself and the Web servers.

  3. In the Additional DNS Name List section, click New, specify a DNS name that appears on the Web pages of your server (for example, a DNS name other than the Web server’s DNS name), then click OK.

    For more information, see Determining Whether You Need to Specify Additional DNS Names.

  4. In the Exclude DNS Name List section, click New, specify a DNS name that appears on the Web pages of your server that you do not want rewritten, then click OK.

    For more information, see Determining Whether You Need to Exclude DNS Names from Being Rewritten.

  5. Use the HTML Rewriter Profile List to configure a profile. Select one of the following actions:

    • New: To create a profile, click New. Specify a display name for the profile and select either Word or Character for the Search Boundary. Continue with Section 2.6.5, Creating or Modifying a Rewriter Profile.

      • Word: A Word profile searches for matches on words. For example, “get” matches the word “get” and any word that begins with “get” such as “getaway” but it does not match the “get” in “together” or “beget.”

        If you create multiple Word profiles, order is important. The first Word profile that matches the page is applied. Word profiles lower in the list are ignored.

      • Character: A Character profile searches for matches on a specified set of characters. For example, “top” matches the word “top” and the “top” in “tabletop,” “stopwatch,” and “topic.”

        If you want to add functionality to the default profile, create a Character profile. It has all the functionality of a Word profile, except searching for attribute names and JavaScript variables and methods. If you create multiple Character profiles, order is important. The first Character profile that matches the page is applied. Character profiles lower in the list are ignored.

    • Delete: To delete a profile, select the profile, then click Delete.

    • Enable: To enable a profile, select the profile, then click Enable.

    • Disable: To disable a profile, select the profile, then click Disable.

    • Modify: To view or modify the current configuration for a profile, click the name of the profile. Continue with Section 2.6.5, Creating or Modifying a Rewriter Profile.

      The default profile is designed to be applied to all pages protected by the Access Gateway. It is not specific to a reverse proxy or its proxy services. If you modify its behavior, remember its scope. Rather than modify the default profile, you should create your own custom Word profile and enable it.

  6. If you have more than one profile in the HTML Rewriter Profile List, use the up-arrow and down-arrow buttons to order the profiles.

    If you create more than one profile, order becomes important. For example, if you want to rewrite all pages with a general rewriter profile (with a URL such as /*) and one specific set of pages with another rewriter profile (with a URL such as /doc/100506/*), you need to have the specific rewriter profile listed before the general rewriter profile.

    Even if multiple Word or Character profiles are enabled, a maximum of one Word profile and one Character profile is executed per page. The first Word profile and Character profile in the list that matches a page are executed, and the others are ignored.

  7. Enable the profiles you want to use for this protected resource. Select the profile, then click Enable.

    The default profile cannot be disabled. However, it is not executed if you have enabled another Word profile that matches your pages, and this profile comes before the default profile in the list.

  8. To save your changes to browser cache, click OK.

  9. To apply your changes, click the Access Gateways link, then click Update > OK.

  10. The cached pages affected by the rewriter changes must be updated on the Access Gateway. Do one of the following:

    • If the changes affect numerous pages, click Access Gateways, select the name of the server, then click Actions > Purge All Cache.

    • If the changes affect only a few pages, you can refresh or reload the pages within the browser.
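The ordering rule described in steps 6 and 7 can be sketched in a few lines of Python. This is only an illustration, not the Access Gateway's implementation: the Profile type, the profile names, and the use of shell-style wildcard matching (fnmatch) for the URL patterns are assumptions made for the example.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional


class Profile(NamedTuple):
    """Hypothetical stand-in for one entry in the HTML Rewriter Profile List."""
    name: str
    kind: str          # "word" or "character"
    url_pattern: str   # for example "/*" or "/doc/100506/*"
    enabled: bool = True


def select_profiles(profiles: list[Profile], path: str) -> dict[str, Optional[str]]:
    """Pick at most one Word and one Character profile: the first enabled match of each kind."""
    chosen: dict[str, Optional[str]] = {"word": None, "character": None}
    for p in profiles:  # list order is the order shown in the profile list
        if p.enabled and chosen[p.kind] is None and fnmatchcase(path, p.url_pattern):
            chosen[p.kind] = p.name
    return chosen


# The specific profile is listed before the general one, as step 6 recommends.
ordered = [
    Profile("docs-word", "word", "/doc/100506/*"),
    Profile("general-word", "word", "/*"),
    Profile("general-char", "character", "/*"),
]

print(select_profiles(ordered, "/doc/100506/index.html"))
# {'word': 'docs-word', 'character': 'general-char'}
print(select_profiles(ordered, "/home.html"))
# {'word': 'general-word', 'character': 'general-char'}
```

If the two Word profiles were listed in the opposite order, the general profile would match every page first and the specific profile would never be applied.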

2.6.5 Creating or Modifying a Rewriter Profile

  1. In the Administration Console, click Devices > Access Gateways > Edit > [Name of Reverse Proxy] > [Name of Proxy Service] > HTML Rewriting.

  2. Select one of the following:

    • To create a new profile, click New, specify a name, select a profile type, then click OK.

    • To modify a profile, click the name of the profile.

  3. Use the Requested URLs to Search section to set up a policy for specifying the URLs you want this profile to match.

    Fill in the following fields:

    If Requested URL Is: Specify the URLs of the pages you want this profile to match. Click New to add a URL to the text box. To add multiple values, enter each value on a separate line.

    And Requested URL Is Not: Specify the URLs of pages that this profile should not match. If a page matches the URL in both the If Requested URL Is list and And Requested URL Is Not list, the profile does not match the page. Click New to add a URL to the text box. To add multiple values, enter each value on a separate line.

    And Document Content-Type Is: Select the content-types you want this profile to match. To add a new content-type, click New and specify the name, such as text/dns. Search your Web pages for content-types to determine whether you need to add new types. To add multiple values, enter each value on a separate line.

    For more information about how to use these options, see Page Matching Criteria for Rewriter Profiles.

  4. Use the Actions section to specify the actions the rewriter should perform if the page matches the criteria in the Requested URLs to Search section.

    Configure the following actions:

    Rewrite Inbound Query String Data: (Not available for Character profiles) Select this option to rewrite the domain and URL in the query string to match the Web server. To use this option, your proxy service must meet the conditions listed in Possible Actions for Rewriter Profiles.

    Rewrite Inbound Post Data: (Not available for Character profiles) Select this option to rewrite the domain and URL in the Post Data to match the Web server. To use this option, your proxy service must meet the conditions listed in Possible Actions for Rewriter Profiles.

    Rewrite Inbound Headers: Select this option to rewrite the following headers:

    • Call-Back
    • Destination
    • If
    • Notification-Type
    • Referer

    Enable Rewriter Actions: Select this action to enable the rewriter to perform any actions:

    • Select it to have the rewriter use the profile to rewrite references and data on the page. If this option is not selected, you cannot configure the action options.

    • Leave it unselected to disable rewriting. This allows you to create a profile for the pages you do not want rewritten.

  5. (Not available for Character profiles) If your pages contain JavaScript, use the Additional Names to Search for URL Strings to Rewrite with Host Name section to specify JavaScript variables or methods. You can also add HTML attribute names. (For the list of attribute names that are automatically searched, see HTML Tags.)

    Fill in the following fields:

    Variable or Attribute Name to Search for Is: Lists the name of an HTML attribute or JavaScript variable to search to see if its value contains a URL string. Click New to add a name to the text box. To add multiple values, enter each value on a separate line.

    JavaScript Method to Search for Is: Lists the names of JavaScript methods to search to see if their parameters contain a URL string. Click New to add a method to the text box. To add multiple values, enter each value on a separate line.

  6. Use the Additional Strings to Replace section to specify a string to search for and specify the text it should be replaced with. The search boundary (word or character) that you specified when creating the profile is used when searching for the string.

    To add a string, click New, then fill in the following:

    Search: Specify the string you want to search for. The profile type controls the matching and replacement rules.

    Replace With: Specify the string you want to use in place of the search string.

  7. Click OK.

  8. If you have more than one profile in the HTML Rewriter Profile List, use the up-arrow and down-arrow buttons to order the profiles.

    If you create more than one profile, order becomes important. For example, if you want to rewrite all pages with a general rewriter profile (with a URL such as /*) and one specific set of pages with another rewriter profile (with a URL such as /doc/100506/*), you must list the specific rewriter profile before the general rewriter profile.

    Even if multiple Word or Character profiles are enabled, a maximum of one Word profile and one Character profile is executed per page. The first Word profile and Character profile in the list that matches a page are executed, and the others are ignored.

  9. Enable the profiles you want to use for this protected resource. Select the profile, then click Enable.

    The default profile cannot be disabled. However, it is not executed if you have enabled another Word profile that matches your pages, and this profile comes before the default profile in the list.

  10. To save your changes to browser cache, click OK.

  11. To apply your changes, click the Access Gateways link, then click Update > OK.

  12. The cached pages affected by the rewriter changes must be updated on the Access Gateway. Do one of the following:

    • If the changes affect numerous pages, click Access Gateways, select the name of the server, then click Actions > Purge All Cache.

    • If the changes affect only a few pages, refresh or reload the page within the browser.
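The page-matching policy configured in step 3 combines three criteria. The following Python sketch shows one way to read that policy. It is an illustration under stated assumptions, not the rewriter's actual logic: the function name is hypothetical, URL patterns are treated as shell-style wildcards, a page matches when any entry in a list matches, and content-types are compared without parameters and ignoring case.

```python
from fnmatch import fnmatchcase


def profile_matches(path, content_type, url_is, url_is_not, content_types):
    """Return True when a page satisfies all three matching criteria."""
    # "If Requested URL Is": at least one pattern must match (assumed OR semantics).
    included = any(fnmatchcase(path, pattern) for pattern in url_is)
    # "And Requested URL Is Not": a match here always excludes the page.
    excluded = any(fnmatchcase(path, pattern) for pattern in url_is_not)
    # "And Document Content-Type Is": compare the bare type, e.g. "text/html".
    bare_type = content_type.split(";")[0].strip().lower()
    type_ok = bare_type in {c.lower() for c in content_types}
    return included and not excluded and type_ok


# Hypothetical policy: every page except those under /private/, HTML only.
criteria = dict(url_is=["/*"], url_is_not=["/private/*"], content_types=["text/html"])

print(profile_matches("/index.html", "text/html; charset=utf-8", **criteria))  # True
print(profile_matches("/private/a.html", "text/html", **criteria))             # False
print(profile_matches("/logo.png", "image/png", **criteria))                   # False
```

The second call shows the rule stated for And Requested URL Is Not: a page that appears in both lists is not matched by the profile.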

2.6.6 Disabling the Rewriter

There are three methods you can use to disable the internal rewriter:

Disabling per Proxy Service

By default, the rewriter is enabled for all proxy services. The rewriter can slow performance because of the parsing overhead. In some cases, a Web site might not have content with URL references that need to be rewritten. The rewriter can be disabled on the proxy service that protects that Web site.

  1. In the Administration Console, click Devices > Access Gateways > Edit > [Name of Reverse Proxy] > [Name of Proxy Service] > HTML Rewriting.

  2. Deselect the Enable HTML Rewriting option, then click OK.

  3. To apply your changes, click the Access Gateways link, then click Update > OK.

  4. Select the Access Gateway, then click Actions > Purge All Cache > OK.

Disabling per URL

You can also specify a list of URLs that are to be excluded from being rewritten for the selected proxy service.

  1. In the Administration Console, click Devices > Access Gateways > Edit > [Name of Reverse Proxy] > [Name of Proxy Service] > HTML Rewriting.

  2. Click the name of the Word profile defined for this proxy service.

    If you have not defined a custom Word profile for the proxy service, you might want to create one. If you modify the default profile, those changes are applied to all proxy services.

  3. In the And Requested URL Is Not section, click New, then specify the names of the URLs you do not want rewritten.

    Specify each URL on a separate line.

  4. Click OK twice.

  5. In the HTML Rewriter Profile List, make sure the profile you have modified is enabled and at the top of the list, then click OK.

  6. To apply your changes, click the Access Gateways link, then click Update > OK.

  7. Select the Access Gateway, then click Actions > Purge All Cache > OK.

Disabling with Page Modifications

In some cases, the URLs in only part of a page, or in only some of its JavaScript or forms, should be rewritten, and the rest should be left unchanged. When this is the case, you might need to modify the content on the Web server. Although this deviates from the design behind Access Manager, you might encounter circumstances where it cannot be avoided.

You can add the following types of tags to the pages on the Web server: page tags, param tags, and form tags.

Browsers treat these tags as comments, so they do not show up on the screen (except possibly on older browser versions).

NOTE: If the pages you modify are cached on the Access Gateway, you need to purge the cache before the changes become effective. Click Access Gateways, select the name of the server, then click Actions > Purge All Cache.

Page Tags: If you want only portions of a page rewritten, you can add the following tags to the page.

<!--NOVELL_REWRITER_OFF--> 
.
.
HTML data not to be rewritten
.
.
<!--NOVELL_REWRITER_ON-->

The last tag is optional. If it is omitted, the rest of the page after the <!--NOVELL_REWRITER_OFF--> tag is not rewritten.
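The effect of the page tags can be modeled with a short Python sketch. It is not the Access Gateway's rewriter: the function name is hypothetical, the rewriting is reduced to a plain host-name substitution (data.com to novell.com, as in the earlier example), and the sketch assumes the tags are passed through to the browser unchanged.

```python
import re

OFF = "<!--NOVELL_REWRITER_OFF-->"
ON = "<!--NOVELL_REWRITER_ON-->"


def rewrite_outside_off_regions(html, private_host, published_host):
    """Replace private_host with published_host, except between OFF and ON tags."""
    # Split on either tag; the capture group keeps the tags in the result list.
    parts = re.split(f"({re.escape(OFF)}|{re.escape(ON)})", html)
    rewriting = True
    out = []
    for part in parts:
        if part == OFF:
            rewriting = False
        elif part == ON:
            rewriting = True
        elif rewriting:
            part = part.replace(private_host, published_host)
        out.append(part)
    return "".join(out)


page = (
    '<a href="http://data.com/a">a</a>'
    + OFF
    + '<a href="http://data.com/b">b</a>'
    + ON
    + '<a href="http://data.com/c">c</a>'
)
print(rewrite_outside_off_regions(page, "data.com", "novell.com"))
```

In the output, the links to /a and /c point to novell.com, and the link to /b still points to data.com. If the closing tag is left out of the page, everything after the first tag keeps its original URLs.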

Param Tags: Sometimes the JavaScript on the page contains <param> elements that contain a value attribute with a URL. You can enable global rewriting of this attribute by adding value to the list of variable and attribute names to search for. If you need more control because some URLs need to be rewritten but others cannot be rewritten, you can turn on and turn off the value rewriting by adding the following tags before and after the <param> element in the JavaScript.

<!--NOVELL_REWRITE_ATTRIBUTE_ON='value'-->
.
.
<param> elements to be rewritten
.
.
<!--NOVELL_REWRITE_ATTRIBUTE_OFF='value'-->
.
.
<param> elements that shouldn’t be rewritten

Form Tags: Some applications have forms in which the <input>, <button>, and <option> elements contain a value attribute with a URL. You can enable global rewriting of these attributes by adding formvalue to the list of variable and attribute names to search for. If you need more control because some URLs need to be rewritten but others cannot be rewritten, you can turn on and turn off the formvalue rewriting by adding the following tags before and after the <input>, <button>, and <option> elements in the form.

<!--NOVELL_REWRITE_ATTRIBUTE_ON='formvalue'-->
.
.
<input>, <button>, and <option> elements to be rewritten
.
.
<!--NOVELL_REWRITE_ATTRIBUTE_OFF='formvalue'-->
.
.
<input>, <button>, and <option> elements that shouldn’t be rewritten
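
The param and form tags can be modeled in the same spirit. The Python sketch below is an assumption-laden illustration, not the product's behavior: the function name is hypothetical, rewriting is reduced to a host-name substitution inside double-quoted value attributes, and attribute rewriting is assumed to be off until an ON tag for the given name is encountered.

```python
import re

MARKER = re.compile(r"<!--NOVELL_REWRITE_ATTRIBUTE_(ON|OFF)='(\w+)'-->")
# Elements whose value attribute each name controls, as described above.
ELEMENTS = {"value": ("param",), "formvalue": ("input", "button", "option")}


def rewrite_marked_values(html, name, private_host, published_host):
    """Rewrite value="..." attributes only between the ON and OFF tags for name."""
    tags = "|".join(ELEMENTS[name])
    attr = re.compile(rf'(<(?:{tags})\b[^>]*?\bvalue=")([^"]*)(")', re.IGNORECASE)

    def rewrite(segment, active):
        if not active:
            return segment
        return attr.sub(
            lambda m: m.group(1) + m.group(2).replace(private_host, published_host) + m.group(3),
            segment,
        )

    out, pos, active = [], 0, False
    for m in MARKER.finditer(html):
        out.append(rewrite(html[pos:m.start()], active))
        out.append(m.group(0))          # keep the tag itself in the page
        if m.group(2) == name:          # tags for other names do not change the state
            active = m.group(1) == "ON"
        pos = m.end()
    out.append(rewrite(html[pos:], active))
    return "".join(out)


form = (
    "<!--NOVELL_REWRITE_ATTRIBUTE_ON='formvalue'-->"
    '<input type="hidden" value="http://data.com/a">'
    "<!--NOVELL_REWRITE_ATTRIBUTE_OFF='formvalue'-->"
    '<option value="http://data.com/b">b</option>'
)
print(rewrite_marked_values(form, "formvalue", "data.com", "novell.com"))
```

Only the <input> element between the two tags has its value changed to the novell.com host; the <option> element after the OFF tag keeps the data.com URL.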