Python: BeautifulSoup - get an attribute value based on the name attribute

I want to print an attribute value based on its name, for example:

<META NAME="City" content="Austin"> 

I want to do something like this:

    soup = BeautifulSoup(f)  # f is some HTML containing the above meta tag
    for meta_tag in soup('meta'):
        if meta_tag['name'] == 'City':
            print(meta_tag['content'])

The code above raises KeyError: 'name'. I believe this is because BeautifulSoup uses name internally, so it cannot be used as a keyword argument.

It's pretty simple; use the following:

    >>> soup = BeautifulSoup('<META NAME="City" content="Austin">')
    >>> soup.find("meta", {"name":"City"})
    <meta name="City" content="Austin" />
    >>> soup.find("meta", {"name":"City"})['content']
    u'Austin'
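The dictionary form is needed because `find()` already takes `name` as its first parameter (the tag name), so `name="City"` cannot be passed as a keyword filter. Here is a minimal sketch of the same lookup, assuming bs4 with the built-in `html.parser`; the helper name `meta_content` is hypothetical and only for illustration:

```python
from bs4 import BeautifulSoup


def meta_content(html, meta_name):
    """Return the content of the first <meta> whose name matches, else None.

    Hypothetical helper, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    # attrs= filters on attribute values; "name" here is the HTML attribute,
    # not the tag name that find() takes as its first argument.
    tag = soup.find("meta", attrs={"name": meta_name})
    return tag.get("content") if tag is not None else None


html = '<META NAME="City" content="Austin">'
print(meta_content(html, "City"))     # Austin
print(meta_content(html, "Country"))  # None
```

The HTML parsers lowercase tag and attribute names, which is why the uppercase NAME in the markup is matched by "name" in the filter.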

Leave a comment if anything is unclear.

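The top answer is the best solution to the question, but here is another way to do the same thing. Also, in your example NAME is written in uppercase, while in your code name is in lowercase.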

    from bs4 import BeautifulSoup

    s = '<div class="question" id="get attrs" name="python" x="something">Hello World</div>'
    soup = BeautifulSoup(s, 'html.parser')
    attributes_dictionary = soup.find('div').attrs
    print(attributes_dictionary)
    # prints (key order may vary): {'id': 'get attrs', 'x': 'something', 'class': ['question'], 'name': 'python'}
    print(attributes_dictionary['class'][0])
    # prints: question
    print(soup.find('div').get_text())
    # prints: Hello World

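The top answer is the best solution, but FYI, the problem you ran into comes from the fact that a Tag object in Beautiful Soup behaves like a Python dictionary. If you access tag['name'] on a tag that has no 'name' attribute, you get a KeyError.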
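To make the dictionary behaviour concrete, here is a small sketch of the loop from the question rewritten with `.get()`, which returns None for a missing attribute instead of raising. The extra `<meta charset>` tag is an assumption added to reproduce a tag with no name attribute:

```python
from bs4 import BeautifulSoup

# The first <meta> has no name attribute, so meta_tag['name'] would raise KeyError on it.
html = '<meta charset="utf-8"><META NAME="City" content="Austin">'
soup = BeautifulSoup(html, "html.parser")

cities = []
for meta_tag in soup.find_all("meta"):
    # .get() works like dict.get(): None when the attribute is absent.
    if meta_tag.get("name") == "City":
        cities.append(meta_tag["content"])

print(cities)  # ['Austin']
```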

The following works:

    from bs4 import BeautifulSoup

    soup = BeautifulSoup('<META NAME="City" content="Austin">', 'html.parser')
    metas = soup.find_all("meta")
    for meta in metas:
        print(meta.attrs['content'], meta.attrs['name'])

You can also try this solution:

To find a value written in a table:

htmlContent


    <table>
      <tr>
        <th> ID </th>
        <th> Name </th>
      </tr>
      <tr>
        <td> <span name="spanId" class="spanclass">ID123</span> </td>
        <td> <span>Bonny</span> </td>
      </tr>
    </table>

Python code


    from bs4 import BeautifulSoup

    soup = BeautifulSoup(htmlContent, "lxml")
    tables = soup.find_all("table")
    for table in tables:
        storeValueRows = table.find_all("tr")
        # Strip whitespace so that " ID " compares equal to "ID".
        thValue = storeValueRows[0].find_all("th")[0].string.strip()
        if thValue == "ID":
            # This condition verifies that this is the table I wanted.
            # storeValueRows[1] is the <tr> at index 1, find_all("span")[0]
            # is its first <span>, and .string gives that span's text.
            value = storeValueRows[1].find_all("span")[0].string
            value = value.strip()  # remove whitespace from both ends
            # Find using an attribute:
            value = storeValueRows[1].find("span", {"name": "spanId"})['class']
            print(value)  # prints ['spanclass'] (bs4 returns class as a list)
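One detail of the attribute lookup above is worth checking in isolation: bs4 treats class as a multi-valued attribute, so indexing it returns a list, while an attribute such as name returns a plain string. A minimal sketch, assuming the built-in `html.parser` instead of lxml and a shortened version of the table:

```python
from bs4 import BeautifulSoup

html_content = (
    '<table><tr><td>'
    '<span name="spanId" class="spanclass">ID123</span>'
    '</td></tr></table>'
)
soup = BeautifulSoup(html_content, "html.parser")

span = soup.find("span", {"name": "spanId"})
print(span["class"])            # ['spanclass']
print(" ".join(span["class"]))  # spanclass
print(span["name"])             # spanId
print(span.string)              # ID123
```

Joining the list with a space gives back the class string as written in the markup.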