Download All Podcasts from an HTML Page (Conversations with History)


(Cross-Post from main website)

I love UCTV's excellent program 'Conversations with History', hosted by Harry Kreisler (podcast url). I like it so much that I want to listen to every episode. The problem is that I don't want to download them one by one to my MP3 player. The webpage also names the .mp3 files with non-consecutive numbers (e.g. 72365.mp3), which makes it hard to tell whether I have already listened to a downloaded file.

This is a script that fetches the podcast page, downloads all the MP3s and names the files after the podcasts' titles (e.g. 'Legislating for the People, with Ronald V. Dellums.mp3').

Run the Script

python3 download_cwh.py


INFO: Starting at 2011-01-02 17:38
DEBUG: Fetching http://podcast.uctv.tv/mp3/20378.mp3, writing to: 0001_islam, identity, and globalization with tariq ramadan.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/19846.mp3, writing to: 0002_henry kaplan and the story of hodgkin's disease.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/19856.mp3, writing to: 0003_america's path to permanent war.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/19488.mp3, writing to: 0004_reforming american health care.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/19602.mp3, writing to: 0005_the bp disaster - lessons from the niger delta.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/19332.mp3, writing to: 0006_political awakenings.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/19331.mp3, writing to: 0007_science diplomacy and nuclear threats.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/19197.mp3, writing to: 0008_nuclear proliferation with ambassador gregory l. schulte.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/18709.mp3, writing to: 0009_from salvation to spirituality.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/18527.mp3, writing to: 0010_reflections on u.s.- canada relations.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/18541.mp3, writing to: 0011_reflections on the university of california.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/18526.mp3, writing to: 0012_islam and the secular state.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/18336.mp3, writing to: 0013_the modern presidency and the national security state with garry wills.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/18375.mp3, writing to: 0014_the making of a marine officer.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/18192.mp3, writing to: 0015_american democracy, veterans, and higher education.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/18152.mp3, writing to: 0016_what made california great.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/18124.mp3, writing to: 0017_what happens when other countries have the money.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/17867.mp3, writing to: 0018_leadership in higher education with hanna holborn gray.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/17979.mp3, writing to: 0019_the grand strategy of the byzantine empire with edward n. luttwak.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/17870.mp3, writing to: 0020_the diaspora and israel.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/17787.mp3, writing to: 0021_finding an authentic voice.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/17602.mp3, writing to: 0022_nuclear weapons and international conflict.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/17601.mp3, writing to: 0023_a life in science: a sense of wonder.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/17429.mp3, writing to: 0024_u.s. policy toward iran: problems and prospects.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/17113.mp3, writing to: 0025_dealing with iran.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16935.mp3, writing to: 0026_reaching for the stars.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16749.mp3, writing to: 0027_social science and the public good.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16463.mp3, writing to: 0028_power, ideas and foreign policy in the 21st century.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16593.mp3, writing to: 0029_dignity, human rights, and torture.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16750.mp3, writing to: 0030_the red cross report, the torture memos, and political accountability with mark danner.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16456.mp3, writing to: 0031_building a multilateral international order.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16455.mp3, writing to: 0032_a microbiologist’s intellectual odyssey.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16462.mp3, writing to: 0033_judges and the rule of law.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16461.mp3, writing to: 0034_identity with john perry.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16278.mp3, writing to: 0035_the politics of the veil.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16279.mp3, writing to: 0036_nuclear power and the challenges of global climate change and nuclear proliferation.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16404.mp3, writing to: 0037_congress, globalization, and the economic crisis.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16412.mp3, writing to: 0038_your inner fish with neil shubin.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16283.mp3, writing to: 0039_identity, freedom, and revolution.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16083.mp3, writing to: 0040_lessons from fdr's new deal.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16225.mp3, writing to: 0041_causes and consequences of the global economic collapse.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15887.mp3, writing to: 0042_art and science.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16055.mp3, writing to: 0043_historical perspective on the global economic crisis.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/16057.mp3, writing to: 0044_understanding the global environmental crisis.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15882.mp3, writing to: 0045_the politics of food.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15783.mp3, writing to: 0046_islam in the west with jocelyn cesari.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15749.mp3, writing to: 0047_diplomacy with jeremy kinsman.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15744.mp3, writing to: 0048_terrorism, immigration and security since 9/11.mp3
DEBUG: Error (<class 'IOError'>): [Errno 2] No such file or directory: '0048_terrorism, immigration and security since 9/11.mp3'
DEBUG: Fetching http://podcast.uctv.tv/mp3/15426.mp3, writing to: 0049_communication as a tool for european democracy.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15607.mp3, writing to: 0050_global poverty, development, and social change.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15745.mp3, writing to: 0051_the rumsfeld memo and the betrayal of american values.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15591.mp3, writing to: 0052_natural capitalism.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15590.mp3, writing to: 0053_abraham lincoln as commander in chief.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15580.mp3, writing to: 0054_the ascent of money.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15530.mp3, writing to: 0055_charting the geopolitics of a new century.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15444.mp3, writing to: 0056_thinking about religion, secularism and politics.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15411.mp3, writing to: 0057_american foreign policy from the end of the cold war to 9/11.mp3
DEBUG: Error (<class 'IOError'>): [Errno 2] No such file or directory: '0057_american foreign policy from the end of the cold war to 9/11.mp3'
DEBUG: Fetching http://podcast.uctv.tv/mp3/15414.mp3, writing to: 0058_pakistan.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15413.mp3, writing to: 0059_china and the united states.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15394.mp3, writing to: 0060_reflections on the supreme court.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15229.mp3, writing to: 0061_how the war on terror turned into a war on american values.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/15135.mp3, writing to: 0062_descent into chaos.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14828.mp3, writing to: 0063_what does china think?.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14818.mp3, writing to: 0064_visualizing the relationship between structure and cellular activity.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14820.mp3, writing to: 0065_terror and consent: the wars for the twenty-first century.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14819.mp3, writing to: 0066_diplomacy and u.s. foreign policy.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14669.mp3, writing to: 0067_biblical insights into the problem of suffering.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14637.mp3, writing to: 0068_the power of words and the power over words.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14632.mp3, writing to: 0069_a surgeon&rsquo;s journey beyond science.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14865.mp3, writing to: 0070_addressing national security challenges in the post 911 world.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14670.mp3, writing to: 0071_reflections on a life as scholar,teacher,and policy advisor.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14633.mp3, writing to: 0072_capitalism, the environment, and crossing from crisis to sustainability.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14510.mp3, writing to: 0073_global competition and the rise of the second world.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14635.mp3, writing to: 0074_vice president cheney and america's response to 911.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14634.mp3, writing to: 0075_afghanistan and pakistan.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14500.mp3, writing to: 0076_u.s. foreign policy and the terrorist threat.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14490.mp3, writing to: 0077_the military in the post 911 world.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14478.mp3, writing to: 0078_the rise of asia and the decline of the west.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14477.mp3, writing to: 0079_chasing the flame.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14341.mp3, writing to: 0080_america&rsquo;s reckless response to terror.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14340.mp3, writing to: 0081_why market reform succeeded and democracy failed in russia.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14259.mp3, writing to: 0082_investigating military conduct at abu ghraib.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14257.mp3, writing to: 0083_the military and political development in egypt, algeria, and turkey.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/14234.mp3, writing to: 0084_the shaping of a legal response to 911.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13992.mp3, writing to: 0085_iran, israel, and the united states.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13974.mp3, writing to: 0086_national security and the rule of law.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13889.mp3, writing to: 0087_nuclear terrorism.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13833.mp3, writing to: 0088_science and history.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13832.mp3, writing to: 0089_science, government, and the university.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13830.mp3, writing to: 0090_the moment of empire.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13693.mp3, writing to: 0091_global capitalism, labor markets, and inequality.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13692.mp3, writing to: 0092_system change or more of the same.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13688.mp3, writing to: 0093_the imperial temptation of america.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13687.mp3, writing to: 0094_britain and america and the making of the modern world.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13662.mp3, writing to: 0095_economics, politics and public discourse.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13665.mp3, writing to: 0096_iran - domestic politics and foreign policy.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13555.mp3, writing to: 0097_what terrorists want.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13568.mp3, writing to: 0098_domestic politics and international relations.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13466.mp3, writing to: 0099_inside muslim militancy.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13433.mp3, writing to: 0100_wealth, empire, and the future of america.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13432.mp3, writing to: 0101_nationalism, cosmopolitanism and american national identity.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13327.mp3, writing to: 0102_truth, power, and the iraq debacle with mark danner.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13297.mp3, writing to: 0103_the jewish century with yuri slezkine.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13290.mp3, writing to: 0104_business, government and ethics in an era of globalization with david vogel.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13167.mp3, writing to: 0105_domestic politics and international behavior: the case of china and the u.s..mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12356.mp3, writing to: 0106_freedom of expression, tolerance, and human rights with t.m. scanlon.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/13013.mp3, writing to: 0107_how traders, preachers, adventurers, and warriors shaped globalization with nayan chanda.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12846.mp3, writing to: 0108_challenges for u.s. national security policy with general tony zinni.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12824.mp3, writing to: 0109_confronting global terrorism: the elements of a liberal grand strategy with tom farer.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12623.mp3, writing to: 0110_israel and the 1967 war with tom segev.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12599.mp3, writing to: 0111_america, europe, and the islamic world with mark steyn.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12490.mp3, writing to: 0112_law, politics, and the coming collapse of the middle class with elizabeth warren.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12489.mp3, writing to: 0113_the last days of the american republic with chalmers johnson.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12294.mp3, writing to: 0114_globalization and the conservative movement in the united states, with john micklethwait.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12295.mp3, writing to: 0115_intuition and rationality with daniel kahneman.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12243.mp3, writing to: 0116_globalization and islam, with olivier roy.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12297.mp3, writing to: 0117_the emergence of the new china with john pomfret.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12185.mp3, writing to: 0118_foreign correspondent - the middle east with robert  fisk.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12117.mp3, writing to: 0119_al-qaeda and the road to 9/11, with lawrence wright.mp3
DEBUG: Error (<class 'IOError'>): [Errno 2] No such file or directory: '0119_al-qaeda and the road to 9/11, with lawrence wright.mp3'
DEBUG: Fetching http://podcast.uctv.tv/mp3/12102.mp3, writing to: 0120_reflections on empire, nationalism and globalization, with kenneth d. kaunda.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12087.mp3, writing to: 0121_ethical realism and u.s. foreign policy, with anatole  lieven and john hulsman.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12103.mp3, writing to: 0122_revolutions in military affairs and the war on terror, with max boot.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12082.mp3, writing to: 0123_the war of the world, with niall ferguson.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/12061.mp3, writing to: 0124_a cosmologist&rsquo;s intellectual journey, with james e. peebles.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11984.mp3, writing to: 0125_women's rights, religious freedom, and liberal education, with martha c. nussbaum.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11335.mp3, writing to: 0126_meaning, relevance and the limits of technology, with hubert dreyfus.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11738.mp3, writing to: 0127_larry brilliant.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11794.mp3, writing to: 0128_the struggle for human rights in iran, with shirin  ebadi.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11722.mp3, writing to: 0129_journalism in the digital age, with michael kinsley.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11692.mp3, writing to: 0130_climate change and public policy, with lars-erik liljelund.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11495.mp3, writing to: 0131_military victory in the information age, with stephen d. biddle.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11493.mp3, writing to: 0132_thinking about the &ldquo;unthinkables&rdquo; in the post 911 world, with harold p smith, jr.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11496.mp3, writing to: 0133_europe and the world, with the right honorable lord patten of barnes ch.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11398.mp3, writing to: 0134_the transformation of american politics, with paul pierson.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11244.mp3, writing to: 0135_science and society, with dudley herschbach.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/9165.mp3, writing to: 0136_the peace movement in historical perspective, with linus pauling.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/9510.mp3, writing to: 0137_on theory, with amartya sen.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/9511.mp3, writing to: 0138_the pentagon's new map, with thomas p.m. barnett.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/9322.mp3, writing to: 0139_economic history, with robert william fogel.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/9171.mp3, writing to: 0140_islam and the state, with vali nasr.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/8994.mp3, writing to: 0141_science and politics, with richard c. lewontin.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/8991.mp3, writing to: 0142_theory and international institutions, with robert o. keohane.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/8800.mp3, writing to: 0143_a geographer's perspective on the new american imperialism, with david harvey.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/8639.mp3, writing to: 0144_the myths of globalization: markets, democracy, and ethnic hatred, with amy chua.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/8634.mp3, writing to: 0145_occupation and terrorism, with amira hass.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/8848.mp3, writing to: 0146_a diplomat's odyssey, with joseph wilson.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/8642.mp3, writing to: 0147_a scientist's random walk, with steven chu.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/8641.mp3, writing to: 0148_militarism and the american empire, with chalmers johnson.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/7894.mp3, writing to: 0149_islam, empire, and the left, with tariq ali.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/7682.mp3, writing to: 0150_islam and the west, with john l. esposito.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/7390.mp3, writing to: 0151_u.s. foreign policy and the american political tradition, with walter russell mead.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/7386.mp3, writing to: 0152_theory, international politics, kenneth n. waltz.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/7388.mp3, writing to: 0153_islam and state power in middle east and central asia, with vitaly naumkin.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/7242.mp3, writing to: 0154_islamic societies, with ira lapidus.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/6797.mp3, writing to: 0155_writing, theatre arts, and political activism, with wole soyinka.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/6728.mp3, writing to: 0156_intelligence and national security in a democracy, jennifer e. sims.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/6726.mp3, writing to: 0157_u.s. foreign policy and multilateral negotiations, with robert l. gallucci.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/6604.mp3, writing to: 0158_the political imagination of islam, with olivier roy.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/6570.mp3, writing to: 0159_pakistan &amp; islamic fundamentalism, with khaled ahmed.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/6568.mp3, writing to: 0160_activism, anarchism, and power, with noam chomsky.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/6569.mp3, writing to: 0161_the rise of militant islam, ahmed rashid.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/6233.mp3, writing to: 0162_the case of trauma and recovery, psychological insight and political understanding, with judith herman.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/6046.mp3, writing to: 0163_adventures of a scientist, with charles w. townes.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/5217.mp3, writing to: 0164_legislating for the people, with ronald v. dellums.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/4975.mp3, writing to: 0165_art and healing, with kenzaburo oe.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/11289.mp3, writing to: 0166_ethics and foreign policy, with father j. bryan hehir.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/7126.mp3, writing to: 0167_intellectual journey:  challenging the conventional wisdom, with john kenneth galbraith.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/7127.mp3, writing to: 0168_reporting the story of  genocide, with philip gourevitch.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/7131.mp3, writing to: 0169_a life in public service, with robert s. mcnamara.mp3
DEBUG: Fetching http://podcast.uctv.tv/mp3/7796.mp3, writing to: 0170_philosophy and the habits of critical thinking, with john r. searle.mp3
INFO: Finishing at 2011-01-02 22:49


Five hours and 4 GB later, the 170 'Conversations with History' episodes had been downloaded.

As can be seen in the output, three files failed to save (IOError). This is simply because the filename was not validated (the '/' in '9/11' is illegal in a filename). This has been fixed by the substIllegalCharsInFilename() function. After introducing the function, I ran the script again. It downloaded the missing files while skipping the ones already downloaded.
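
For illustration, here is a minimal sketch of the same idea using a regular expression instead of a character loop. The name sanitize_filename and the choice of '_' as the replacement character are assumptions made for this example; they are not taken from download_cwh.py.

```python
import re

# Hypothetical helper (not part of download_cwh.py).
# Matches any character outside the script's list of allowed characters.
_ILLEGAL_CHARS = re.compile(r"[^A-Za-z0-9_\-,?.': ]")

def sanitize_filename(filename):
    # Assumption for this sketch: illegal characters become '_'.
    return _ILLEGAL_CHARS.sub("_", filename)

print(sanitize_filename("0048_terrorism, immigration and security since 9/11.mp3"))
# prints: 0048_terrorism, immigration and security since 9_11.mp3
```

With the '/' gone, open() no longer interprets '9' as a directory name, which is what caused the "No such file or directory" errors in the log.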

The Code! (Also Check Download Section)

import os.path
import urllib.request
import logging
import time
# constants
LOG_FILENAME = 'download_cwh.log'
CACHE_FILE = 'uctv_cwh_htmlcache.html'
SRC_WEBSITE_CWH = 'http://www.uctv.tv/cwh/'
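class ScriptLogHandler(logging.FileHandler):
  """Save to file and output to screen"""
  def emit(self, record):
    print("{0}: {1}".format(record.levelname, record.getMessage()))
    logging.FileHandler.emit(self, record)
# configure logger
logger = logging.getLogger("script_logger")
# let DEBUG and INFO messages through (the default level would drop them)
logger.setLevel(logging.DEBUG)
log_handler = ScriptLogHandler(LOG_FILENAME)
logger.addHandler(log_handler)
class Got404WhileAttemptingToDlPodcast(Exception):
  """Raised when a podcast URL answers with HTTP 404"""
  pass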
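def substIllegalCharsInFilename(filename):
  allowedChars = """abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_-,?.': """
  lstOutChars = []
  for char in filename:
    if char in allowedChars:
      lstOutChars.append(char)
    else:
      # assumption: the original replacement was lost; '_' is used here
      lstOutChars.append('_')
  return "".join(lstOutChars)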
def getTimeNow():
  strTime = time.strftime("%Y-%m-%d %H:%M", time.localtime())
  return strTime
def getHtml():
  sock =  urllib.request.urlopen(SRC_WEBSITE_CWH)
  htmlSource = sock.read()
  return htmlSource.decode("utf-8")
def findTitleUrlTuples(htmlSource):
  curr_program_title = ''
  tag_beginTitle = 'Conversations With History:'.lower()
  tag_mp3 = 'Audio Podcast'.lower()
  lst_TitleUrlTuples = []
  for line in htmlSource.splitlines():
    # Ignore case
    line = line.lower()
    if line.find(tag_beginTitle) != -1:
      pre = line.split(tag_beginTitle, 1)[1]
      curr_program_title = pre.split('<',1)[0].strip()
    if line.find(tag_mp3) != -1 and line.find('.mp3') != -1:
      mp3_url_posthttp          = line.split('http')[1]
      mp3_url_posthttp_premp3   = mp3_url_posthttp.split('.mp3', 1)[0]
      mp3_url = 'http' + mp3_url_posthttp_premp3 + '.mp3'
      lst_TitleUrlTuples.append((curr_program_title, mp3_url))
  return lst_TitleUrlTuples
def numTo4DigitsStr(iNum):
  strNum = str(iNum)
  while len(strNum) < 4:
    strNum = '0' + strNum
  return strNum
def downloadMp3sFromTitleUrlTuple(titleUrlTuple, fileNo):
  (title, mp3_url) = titleUrlTuple
  filename = "{0}_".format(numTo4DigitsStr(fileNo)) + title + '.mp3'
  filename = substIllegalCharsInFilename(filename)
  if not os.path.isfile(filename):
    logger.debug("Fetching {0}, writing to: {1}".format(mp3_url, filename))
    sock = urllib.request.urlopen(mp3_url)
    mp3_bytes = sock.read()
    sock.close()
    # 'with' closes the file handle even if the write fails
    with open(filename, 'wb') as fh:
      fh.write(mp3_bytes)
  else:
    logger.debug("Skipping {0}, file already in directory.".format(filename))
def downloadMp3sFromTitleUrlTuples(lst_TitleUrlTuples):
  fileNo = 0
  for titleUrlTuple in lst_TitleUrlTuples:
    fileNo = fileNo + 1
    # log the error and carry on with the next podcast
    try:
      downloadMp3sFromTitleUrlTuple(titleUrlTuple, fileNo)
    except Got404WhileAttemptingToDlPodcast as inst:
      logger.debug("Error (" + str(type(inst)) + "): " + str(inst))
    except Exception as inst:
      logger.debug("Error (" + str(type(inst)) + "): " + str(inst))
def do():
  htmlSource = getHtml()
  lst_TitleUrlTuples = findTitleUrlTuples(htmlSource)
  downloadMp3sFromTitleUrlTuples(lst_TitleUrlTuples)
if __name__ == '__main__':
  logger.info("Starting at {0}".format(getTimeNow()))
  do()
  logger.info("Finishing at {0}".format(getTimeNow()))




It might have been faster to use the 'html.parser.HTMLParser' class and regexes to extract the file URLs from the HTML page.
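
As a rough sketch of what that could look like, the snippet below subclasses HTMLParser and collects every link ending in .mp3. The class name Mp3LinkParser and the sample markup are made up for this example; the real UCTV page may be laid out differently, so treat it as a starting point rather than a drop-in replacement.

```python
from html.parser import HTMLParser

class Mp3LinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at an .mp3 file."""

    def __init__(self):
        super().__init__()
        self.mp3_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().endswith(".mp3"):
                self.mp3_urls.append(value)

# Made-up sample markup (assumption); not copied from the real page.
sample_html = """
<p>Conversations With History: Pakistan</p>
<a href="http://podcast.uctv.tv/mp3/15414.mp3">Audio Podcast</a>
<a href="/about.html">About</a>
"""

parser = Mp3LinkParser()
parser.feed(sample_html)
print(parser.mp3_urls)
# prints: ['http://podcast.uctv.tv/mp3/15414.mp3']
```

The titles could be picked up in a similar way with a handle_data() method, which would avoid splitting lines by hand.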

1 comment:

Anonymous said...

download_cwh.zip: file at: http://david-web.appspot.com/static/download_cwh.zip