Sunday, November 2, 2014

link parser in Python

Link Parser


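import urllib.request


def parse_links(source):
    """Return the target of every <a href="..."> found in the HTML text."""
    links = []
    # The first chunk is the text before the first link, so skip it.
    for chunk in source.split('<a href="')[1:]:
        # The URL runs up to the closing double quote.
        links.append(chunk.split('"')[0])
    return links


def load_source(website):
    """Download a page and return its HTML as text."""
    with urllib.request.urlopen(website) as response:
        # read() returns bytes; decode them instead of calling str() on them.
        return response.read().decode("utf-8", errors="replace")


def main():
    # Use the address the user enters; fall back to the example site.
    website = input("Start:") or "http://nzz.ch"
    source = load_source(website)
    for link in parse_links(source):
        print(link)
    print("END")


if __name__ == "__main__":
    main()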


Author: Marcin
Language: Python 3.4
Info:
A simple script that extracts the links from a website or from an HTML file. It is a simple parser that
searches the page source for anchor tags written as <a href="..."> and saves the URL found in each one.
Video: https://www.youtube.com/watch?v=mbFovXwFWn4
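
Splitting on the exact text <a href=" only finds links that are written precisely that way. A tag with other attributes before href, single quotes, or uppercase letters is missed. As a rough sketch of a sturdier variant (not part of the original script), the standard library's html.parser module can do the tag matching instead. The names LinkCollector and parse_links_with_htmlparser, and the sample string, are made up for this illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it encounters (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and attribute names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def parse_links_with_htmlparser(source):
    """Return all href values of <a> tags in the given HTML text."""
    collector = LinkCollector()
    collector.feed(source)
    collector.close()
    return collector.links


# Made-up sample: one link with an extra attribute and single quotes, one in uppercase.
sample = '<p><a class="x" href=\'/one\'>1</a> <A HREF="http://example.com/two">2</A></p>'
print(parse_links_with_htmlparser(sample))
```

The function takes decoded text, so it could stand in for the split-based parser once the downloaded bytes have been decoded. Relative links such as /one come back unchanged; turning them into full addresses would be a separate step.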
