Python: BeautifulSoup - get an attribute value based on the name attribute

I want to print an attribute value based on its name. Take, for example:

<META NAME="City" content="Austin">

I want to do something like this:

    soup = BeautifulSoup(f)  # f is some HTML containing the above meta tag
    for meta_tag in soup('meta'):
        if meta_tag['name'] == 'City':
            print(meta_tag['content'])

The code above gives a KeyError: 'name'. I believe this is because name is used by BeautifulSoup, so it cannot be used as a keyword argument.

This is pretty simple; use the following:

    >>> soup = BeautifulSoup('<META NAME="City" content="Austin">')
    >>> soup.find("meta", {"name":"City"})
    <meta name="City" content="Austin" />
    >>> soup.find("meta", {"name":"City"})['content']
    u'Austin'

Leave a comment if anything is unclear.
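For reference, here is a minimal sketch of the same lookup written for bs4, assuming the built-in html.parser backend. In find(), the keyword name is already taken by the tag name, so an HTML attribute called name has to be passed through attrs. The CSS selector variant and the extra charset tag are illustrative additions, not part of the original question.

```python
from bs4 import BeautifulSoup

# Sample markup: the tag from the question plus one meta tag without a name attribute
html = '<META NAME="City" content="Austin"><meta charset="utf-8">'
soup = BeautifulSoup(html, "html.parser")

# The first positional argument of find() is the tag name,
# so the HTML attribute "name" is filtered via attrs=
by_attrs = soup.find("meta", attrs={"name": "City"})

# The same filter expressed as a CSS attribute selector
by_css = soup.select_one('meta[name="City"]')

city = by_attrs["content"]
print(city)
print(by_css["content"])
```

Both lookups return the same tag here; which one reads better is mostly a matter of taste.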

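The answer above addresses the question, but here is another way to do the same thing. Also, in your example NAME is written in capital letters, while your code uses the lowercase name.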

    from bs4 import BeautifulSoup

    s = '<div class="question" id="get attrs" name="python" x="something">Hello World</div>'
    soup = BeautifulSoup(s, 'html.parser')

    # .attrs is a dictionary holding all attributes of the tag
    attributes_dictionary = soup.find('div').attrs
    print(attributes_dictionary)
    # prints the attribute dictionary, e.g.:
    # {'class': ['question'], 'id': 'get attrs', 'name': 'python', 'x': 'something'}

    print(attributes_dictionary['class'][0])
    # prints: question

    print(soup.find('div').get_text())
    # prints: Hello World
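On the upper-case versus lower-case point: as far as I can tell, the case difference is not what triggers the error. A small sketch, assuming the built-in html.parser backend, which lowercases tag and attribute names while parsing (other parsers may behave differently):

```python
from bs4 import BeautifulSoup

# Tag and attribute names are written in upper case in the markup
soup = BeautifulSoup('<META NAME="City" CONTENT="Austin">', "html.parser")
tag = soup.find("meta")

# After parsing, both the tag name and the attribute keys are lower case
attr_names = sorted(tag.attrs)
has_upper = "NAME" in tag.attrs

print(tag.name)
print(attr_names)
print(has_upper)
```

So with this parser, tag['name'] finds the attribute even though the source says NAME.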

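The top answer is the best solution, but FYI, the problem you ran into comes from the fact that a Tag object in Beautiful Soup behaves like a Python dictionary. If you access tag['name'] on a tag that has no 'name' attribute, you get a KeyError.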
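A short sketch of that behavior, assuming bs4 with the html.parser backend. The charset tag is a made-up example of a meta tag without a name attribute. Like a dictionary, a Tag also has .get(), which returns None instead of raising:

```python
from bs4 import BeautifulSoup

html = """
<meta charset="utf-8">
<META NAME="City" content="Austin">
"""
soup = BeautifulSoup(html, "html.parser")

# Indexing a tag that lacks the attribute raises KeyError, as with a dict
first_meta = soup.find("meta")
raised = False
try:
    first_meta["name"]
except KeyError:
    raised = True

# .get() returns None for a missing attribute, so the loop can skip such tags
cities = []
for meta_tag in soup.find_all("meta"):
    if meta_tag.get("name") == "City":
        cities.append(meta_tag.get("content"))

print(raised)
print(cities)
```

With .get(), the loop from the question runs over every meta tag without failing on the ones that have no name.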

The following works:

    from bs4 import BeautifulSoup

    soup = BeautifulSoup('<META NAME="City" content="Austin">', 'html.parser')

    metas = soup.find_all("meta")

    for meta in metas:
        # Each tag's attributes live in its .attrs dictionary
        print(meta.attrs['content'], meta.attrs['name'])

You can also try this solution:

To find a value written inside a table

htmlContent

    <table>
        <tr>
            <th> ID </th>
            <th> Name </th>
        </tr>
        <tr>
            <td> <span name="spanId" class="spanclass">ID123</span> </td>
            <td> <span>Bonny</span> </td>
        </tr>
    </table>

Python code

    from bs4 import BeautifulSoup

    # htmlContent holds the HTML shown above; the "lxml" parser needs the lxml package
    soup = BeautifulSoup(htmlContent, "lxml")

    tables = soup.find_all("table")

    for table in tables:
        storeValueRows = table.find_all("tr")
        # .string keeps the surrounding whitespace, so strip it before comparing
        thValue = storeValueRows[0].find_all("th")[0].string.strip()

        if thValue == "ID":  # check that this is the table I wanted
            # storeValueRows[1] is the <tr> at index 1 of the table,
            # find_all("span")[0] is its first <span>, and .string is that span's text
            value = storeValueRows[1].find_all("span")[0].string
            value = value.strip()  # remove whitespace from both ends of the string

            # find using an attribute:
            value = storeValueRows[1].find("span", {"name": "spanId"})['class']
            print(value)  # class is multi-valued in bs4, so this prints ['spanclass']
