import re


def stripPathSessionID(path):
    """It looks like the Java version returns a lowercased path.
    So why does it use a case-insensitive regex? We won't lowercase here.

    These doctests are from IAURLCanonicalizerTest.java:

    Check ASP_SESSIONID2:
    >>> stripPathSessionID("/(S(4hqa0555fwsecu455xqckv45))/mileg.aspx")
    '/mileg.aspx'

    Check ASP_SESSIONID2 (again):
    >>> stripPathSessionID("/(4hqa0555fwsecu455xqckv45)/mileg.aspx")
    '/mileg.aspx'

    Check ASP_SESSIONID3:
    >>> stripPathSessionID("/(a(4hqa0555fwsecu455xqckv45)S(4hqa0555fwsecu455xqckv45)f(4hqa0555fwsecu455xqckv45))/mileg.aspx?page=sessionschedules")
    '/mileg.aspx?page=sessionschedules'

    '@' in path:
    >>> stripPathSessionID("/photos/36050182@N05/")
    '/photos/36050182@N05/'
    """
    # Raw strings keep the regex backslashes literal and avoid
    # invalid-escape warnings on newer Python versions.
    patterns = [
        # "(X(<24 chars>)...)/" segment: one or more letter-tagged IDs
        re.compile(r"^(.*/)(\((?:[a-z]\([0-9a-z]{24}\))+\)/)([^\?]+\.aspx.*)$", re.I),
        # "(<24 chars>)/" segment: a single bare ID
        re.compile(r"^(.*/)(\([0-9a-z]{24}\)/)([^\?]+\.aspx.*)$", re.I),
    ]

    for pattern in patterns:
        m = pattern.match(path)
        if m:
            # Keep what precedes and follows the session segment (group 2).
            path = m.group(1) + m.group(3)

    return path
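
The sketch below exercises the same two patterns on inputs that are not among the doctests, to show how the case-insensitive flag behaves when the path is not lowercased. The module-level name `PATTERNS` and the sample paths are assumptions made for illustration; they do not come from surt.

```python
import re

# Same two patterns as the function above, written as raw strings.
# PATTERNS is a hypothetical name used only in this sketch.
PATTERNS = [
    re.compile(r"^(.*/)(\((?:[a-z]\([0-9a-z]{24}\))+\)/)([^\?]+\.aspx.*)$", re.I),
    re.compile(r"^(.*/)(\([0-9a-z]{24}\)/)([^\?]+\.aspx.*)$", re.I),
]


def stripPathSessionID(path):
    for pattern in PATTERNS:
        m = pattern.match(path)
        if m:
            # Drop group 2 (the session segment), keep the rest as-is.
            path = m.group(1) + m.group(3)
    return path


# Matching ignores case, but the surviving path keeps its original case.
print(stripPathSessionID("/(S(4HQA0555FWSECU455XQCKV45))/MiLeg.ASPX"))
# -> /MiLeg.ASPX

# The patterns require ".aspx" after the session segment, so this
# path is returned unchanged.
print(stripPathSessionID("/(S(4hqa0555fwsecu455xqckv45))/index.html"))
# -> /(S(4hqa0555fwsecu455xqckv45))/index.html
```

This suggests one plausible answer to the question in the docstring: with `re.I`, the function can strip session IDs whether or not the caller has lowercased the path first.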
        


src/s/u/surt-0.2/surt/IAURLCanonicalizer.py
import re
from handyurl import handyurl
from URLRegexTransformer import stripPathSessionID, stripQuerySessionID
 
# Excerpt from canonicalize(); the surrounding lines are not shown.
            path = path.lower()
        if path_strip_session_id and path:
            path = stripPathSessionID(path)
        if path_strip_empty and '/' == path:
            path = None
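
As a rough illustration of how the path steps in this excerpt fit together, here is a minimal standalone sketch. The helper `canonicalize_path`, the flag name `path_lowercase`, and all default values are assumptions: the excerpt does not show the condition guarding `path = path.lower()`, nor how the real `canonicalize()` receives its options.

```python
import re

# Session-ID patterns as used by stripPathSessionID.
_PATTERNS = [
    re.compile(r"^(.*/)(\((?:[a-z]\([0-9a-z]{24}\))+\)/)([^\?]+\.aspx.*)$", re.I),
    re.compile(r"^(.*/)(\([0-9a-z]{24}\)/)([^\?]+\.aspx.*)$", re.I),
]


def stripPathSessionID(path):
    for pattern in _PATTERNS:
        m = pattern.match(path)
        if m:
            path = m.group(1) + m.group(3)
    return path


def canonicalize_path(path,
                      path_lowercase=True,         # assumed flag name and default
                      path_strip_session_id=True,  # default is an assumption
                      path_strip_empty=False):     # default is an assumption
    """Hypothetical helper mirroring the order of the steps in the excerpt."""
    if path_lowercase and path:
        path = path.lower()
    if path_strip_session_id and path:
        path = stripPathSessionID(path)
    if path_strip_empty and '/' == path:
        # A bare "/" is treated as no path at all.
        path = None
    return path


print(canonicalize_path("/(S(4hqa0555fwsecu455xqckv45))/MiLeg.aspx"))
# -> /mileg.aspx
print(canonicalize_path("/", path_strip_empty=True))
# -> None
```

Lowercasing runs before the session ID is stripped, so in this ordering the stripped path is lowercase only because of the earlier step, not because of anything `stripPathSessionID` does.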