October 21, 2020
This post represents an effort to find the various filetypes that might exist across a 597-strong list of Apple subdomains that was recently found on a random website. Unfortunately, the original URL hosting the list was not documented.
Regardless, a dry run was executed against all 597 domains using the wayback_machine_downloader with the -l option switched on, which prevents any downloads from occurring. While the script ran, its stdout was logged to a text file to document the process. This resulted in a 1.05GB log file. Although the script was only set up to curate pkg, as, hqx, cpt, bin, sea, sit, sitx, dd, and pit files, it’s clear that many other filetypes were documented in the log file.
The block of Apple subdomains below is the source against which the wayback_machine_downloader will be run to curate (download) files. Some of the subdomains may never have been captured by the Wayback Machine, for a variety of reasons.
apple-wwnet.apple.com, nwk-www.apple.com, nwk-search.apple.com, nwk-qtannounce.apple.com, nwk-serialno.apple.com, nwk-qtinstall.apple.com, awpicts.apple.com, nwk-stream.qtv.apple.com, nwk-qttest.apple.com, nwk-qtsoftware.apple.com, helpqt.apple.com, helposx.apple.com, help.apple.com, ssl.apple.com, nwk-redirect.apple.com, livepage.apple.com, instore.apple.com, applebrasil.com, .br asia-red.apple.com, www.uk.apple.com, speech.apple.com, nwk-qtinstall.apple.com, nwk-qtpix.apple.com, nwk-www.support.apple.com, nwk-groups.mac.com, apple-order1.apple.com, ink.apple.com, ipod.com, itunes.com, rss.itunes.com, nwk-neo.qtv.apple.com, apple-darwin.com, nwk-hompage.mac.com, nwk-prohelp.apple.com, nwk-documentation.apple.com, bz-si-a.apple.com, bz-si-b.apple.com, fdns1.apple.com, fdns2.apple.com, time0.apple.com, time1.apple.com, time2.apple.com, time.apple.com, fdns5.apple.com, fdns6.apple.com, smtpin01.universityarts.com, mediaport01.apple.com, mediaport02.apple.com, qtannounce.apple.com, debit.apple.com, nsagate.apple.com, nserver.apple.com, mail-out2.apple.com, mail-out1.apple.com, forum.apple.com, mail-in1.apple.com, mail-in2.apple.com, nserver2.apple.com, littlebuddy.apple.com, connect1.apple.com, connect2.apple.com, scv-wsidecar.apple.com .0.254.17.in-addr.arpa, console1.apple.com, console2.apple.com, ray.apple.com, jay.apple.com, littlebuddyx.apple.com, wwb1.apple.com, wwb2.apple.com, mycroft.apple.com, tmpforum.apple.com, www.apple.com, bz1.apple.com, bz2.apple.com, tondero.apple.com, piglet.apple.com, netcache11-e0.apple.com, www.apple.com, netcache11-e5.apple.com, webmail.apple.com, qtbetastreamer1.apple.com, nosleep.apple.com, macos.apple.com, cab.apple.com, hwseeding.apple.com, developer1.apple.com, skippy.apple.com, developer2.apple.com, acrux-o.apple.com, spica.apple.com, pariah.apple.com, broadcaster.apple.com, cay.apple.com, asedb.apple.com, cay01.apple.com, cay02.apple.com, velocity.apple.com, internet1.apple.com, unihan.unicode.org lists.apple.com, 
applenews.lists.apple.com, craft.training.apple.com, search.apple.com, qtconfig.apple.com, qtconfig2.apple.com, qtd.apple.com, qtconfig.apple.com, netcache12-e0.apple.com, ext-stage.apple.com, qtstage.apple.com, qtstage2.apple.com, upload.qtv.apple.com, apple.com, ftpdev01.apple.com, ftpdev02.apple.com, campreg.apple.com, rns1.apple.com, rns1.apple.com, enfuego.apple.com, developer3.apple.com, developer4.apple.com, help1.apple.com, help2.apple.com, help3.apple.com, charple.apple.com, unsure.apple.com, sure.apple.com, sure01.apple.com, sure02.apple.com, rdns1.apple.com, ftp.qtv.apple.com, neo.qtv.apple.com, sfxf1.apple.com, sfxf2.apple.com, sfxm.apple.com, sfxm1.apple.com, sfxm2.apple.com, sfxm3.apple.com, sfx.apple.com, sfxt.apple.com, imercury2.apple.com, imercury3.apple.com, spica1.apple.com, spica2.apple.com, instanton.apple.com, gidget160.apple.com, gidget161.apple.com, gsearch.apple.com, xlate.apple.com, xlate1.apple.com, learnandearn.apple.com, signin.apple.com, devcon.apple.com, changeinfo.apple.com, register.apple.com, retail.apple.com, asw.apple.com, asw2.apple.com, aswf.apple.com, aswf2.apple.com, daw.apple.com, dawws01.apple.com, dawws02.apple.com, truecomp.apple.com, b2bstore.appple.com, education.apple.com, crt.apple.com, train.apple.com, bugreport.apple.com, itsa.apple.com, itse.apple.com, arthurwmml.apple.com, krazy.apple.com, kool.apple.com, topcat.apple.com, fritz.apple.com, mxwm.apple.com, seacoast.apple.com, etsdev1wm.apple.com, micronwm.apple.com, wmdev.apple.com, gra1.apple.com, gra2.apple.com, gcrm.apple.com, casis.apple.com, epp.apple.com, swupdate.apple.com, update10.apple.com, update11.apple.com, gra.apple.com, testswupdate.apple.com, webnfaws.apple.com, connect.apple.com, register-tmp.apple.com, stream.apple.akadns.net, stream.apple.akadns.net, stream.apple.akadns.net, stream.apple.akadns.net, stream.apple.akadns.net, stream.apple.akadns.net, apply-euro.apple.com, developer.apple.com, webgdv.apple.com, radarsubmissions.apple.com, 
scv-ipodrocks.apple.com, myinfo.apple.com, naswws.apple.com, chatbox-web1.apple.com, chatbox-web2.apple.com, milhousewm.apple.com, margewmt.apple.com, margewmm.apple.com, arthurwm.apple.com, arthurwmfw.apple.com, store7.apple.com, vs6-1.apple.com, vs6-2.apple.com, vs7-1.apple.com, vs7-2.apple.com, jobs.apple.com, search.lists.apple.com, iforgot.apple.com, aristo.apple.com, maosh.apple.com, salestraining.apple.com, gsx.apple.com, scv-searchcgi.apple.com, caesar.apple.com, wdg1.apple.com, wdg2.apple.com, jobsws1.apple.com, jobsws2.apple.com, adcsearch.apple.com, adcweb.apple.com, gcrmfg.apple.com, wdb.apple.com, monterey.apple.com, portland.apple.com, search-fr.apple.com, search-de.apple.com, search-nl.apple.com, search-no.apple.com, search-it.apple.com, search-es.apple.com, search-dk.apple.com, search-se.apple.com, search-fi.apple.com, chatbox-web.apple.com, t2castro.apple.com, t2cool.apple.com, t2valenn.apple.com, t2zathras.apple.com, t2submit.apple.com, t2guidejp.apple.com, nugget.apple.com, t2nugget.apple.com, eureka.apple.com, t2eureka.apple.com, guidejp2.apple.com, t2guidejp2.apple.com, xsurvey.apple.com, t2higgs.apple.com, higgs.apple.com, t2perkins.apple.com, perkins.apple.com, adcweb1.apple.com, adcweb2.apple.com, adcweb3.apple.com, adcweb4.apple.com, mp4reg.apple.com, designawardst.apple.com, trainingsun1.apple.com, trainingsun2.apple.com, trainingsun3.apple.com, trainingsun4.apple.com, trainingsun5.apple.com, trainingsun6.apple.com, trainingsun7.apple.com, trainingsun8.apple.com, trainingsun9.apple.com, dvdsp.apple.com, qtsoftware.apple.com, helpqt.apple.com, helposx.apple.com, help.apple.com, guide-n.apple.com, guide1-n.apple.com, guide2-n.apple.com, guidejp-n.apple.com, guidejp1-n.apple.com, guidejp2-n.apple.com, applemusic.com, vs5-2.apple.com, www.access.apple.com, store.apple.com, connect6.apple.com, connect7.apple.com, store6.apple.com, itunes.com, www.anat.apple.com, bananajr6000.apple.com, devseed.apple.com, sherlockdevelopert.apple.com, 
rodwm.apple.com, gdv.apple.com, connect3.apple.com, extlor.apple.com, appleorder.apple.com, www.applereg.com, sonarproxy.apple.com, rfaproxy.apple.com, topfritz.apple.com, customer.apple.com, radarproxy.apple.com, store8.apple.com, red2.apple.com, learnandearn1.apple.com, learnandearn2.apple.com, ipod.com, icards.mac.com, supplier.apple.com, www.latinamerica.apple.com, kangaroo.apple.com, websabc.apple.com, spider.apple.com, certifications.apple.com, war.apple.com, jobsat.apple.com, backstage.apple.com, webobjects.apple.com, survey.apple.com, spssprod.apple.com, survey.edu-research.com, redirect.apple.com, red1.apple.com .3.254.17.in-addr.arpa, adcstudentst.apple.com, firewire.apple.com, mirror.apple.com, hound.apple.com, veloxs.apple.com, ws7.apple.com, ws8.apple.com, iss.apple.com, ppclinux.apple.com, powerbook.apple.com, servers.apple.com, rss.itunes.com, xerxes.apple.com, devcd.apple.com, qtpartners.apple.com, appletalkback.apple.com, markets.apple.com, campusreps2.apple.com, salesready.apple.com, campusreps3.apple.com, ws1.quicktime.apple.com, ws2.quicktime.apple.com, campusreps4.apple.com, reg.devworld.apple.com, survey.devworld.apple.com, b2bpbzml.apple.com, speech.apple.com, cgi2.training.apple.com, qtj.apple.com, highered.apple.com, campusreps5.apple.com, campusreps6.apple.com, www.applenet.apple.com, seeding.apple.com, qtinstall.apple.com, www.apple.fr, arthurlwm.apple.com, galileo.apple.com, www.apple.nl, www.apple.de, galileo2.apple.com, www.appledesigns.com, www.apple.at, adc-kbaset.apple.com, www.apple.ch, www.apple.be, www.apple.co.jp prismo.com, www.apple.se, www.apple.no, www.apple.fi, applejava.apple.com, www.applestore.ch, www.apple.dk, www.apple.it, www.apple.za, www.apple.es, www.uk.apple.com, cgis.training.apple.com, cgi5.training.apple.com, applescript.apple.com, hypercard.apple.com, wwc1.apple.com, wwc2.apple.com, wwc3.apple.com, awpicts.apple.com, apple.com, www.euro.apple.com, www.apple.ca, stories.apple.com, colorsync.apple.com, 
www.dueval.apple.com, stories2.apple.com, clarisftp.apple.com, slg.apple.com, qtpix.apple.com, resellerapplication.apple.com, livepage.apple.com, ra.apple.com, stories3.apple.com, azis.apple.com, qtvr20test.apple.com, var.apple.com, clarisworks.apple.com, applemasters01.apple.com, adc10.apple.com, software.apple.com, www.support.apple.com, survey1.apple.com, unicode2.apple.com, stream.qtv.apple.com, qttest.apple.com, qtpartners.apple.com, www.homepage.mac.com, cgi4.training.apple.com, apple-order1.apple.com, apple-order1.apple.com, apple-order1.apple.com, apple-order1.apple.com, federal.apple.com, fonts.apple.com, red-train.apple.com, quicktimelive.com, product.info.apple.com, product.info.apple.com, product.info.apple.com, product.info.apple.com, www0.info.apple.com, www0.info.apple.com, kidsafe.apple.com, mp4ra.org, spruceuserforums.apple.com, sprucesupport.apple.com, spruceregistration.com, www.spruce-tech.com, www.webdvd.org, shake.apple.com, chatbox-smtp-in.apple.com, chatbox-smtp-out1.apple.com, chatbox-smtp-out2.apple.com, chatbox-smtp-out3.apple.com, chatbox-smtp-out4.apple.com, chatbox-smtp-out5.apple.com, chatbox-smtp-out6.apple.com, gateway221.apple.com, ex0-stephenbz-221.apple.com, ex1-stephenbz-221.apple.com, engbz-bi-3a-221.apple.com, engbz-bi-3b-221.apple.com, spsi1.apple.com, spsi2.apple.com, sp-bi-4ka.apple.com, sp-bi-4kb.apple.com, adcevents.apple.com, adcinvoice.apple.com, adc-kbase.apple.com, adckitchen.apple.com, adcrequest.apple.com, adcstudents.apple.com, adctv.apple.com, designawards.apple.com, sherlockdeveloper.apple.com, techtalk.apple.com, wwdcexhibitors.apple.com, wwdcvolunteers.apple.com, guide.apple.com, hotdeals.apple.com, mpgsubmit.apple.com, mpgmonitor.apple.com, mpgjp1.apple.com, mpgjp2.apple.com, adcpdub01.apple.com, adcpdub02.apple.com, adcpdub03.apple.com, adcpdub04.apple.com, adcpdub05.apple.com, adcpdub06.apple.com, adcftp1.apple.com, adcftp2.apple.com, adcftp3.apple.com, plugfest.apple.com, gateway223.apple.com, 
ex0-mail-bz-223.apple.com, ex1-mail-bz-223.apple.com, mail-in5.apple.com, mail-in4.apple.com, mail-in3.apple.com, mail-out3.apple.com, mail-out4.apple.com, bz3.apple.com, bz4.apple.com, gateway226.apple.com, ex0-efax-226.apple.com, ex1-efax-226.apple.com, embrun.apple.com, downey.apple.com, compton.apple.com, cassis.apple.com, hobart.apple.com, rostock.apple.com, tre.apple.com, ton.apple.com, ftp01.apple.com, ftp02.apple.com, ftp03.apple.com, privftp.apple.com, wc1.apple.com, wcstream.apple.com, wchttp.apple.com, lfcomponents.apple.com, canadaapp.apple.com, jobtasks.apple.com, mirror.apple.com, odenya.apple.com, fcptrain.apple.com, www.opensource.apple.com, anoncvs.opensource.apple.com, clearmater.apple.com, wwmarkets.apple.com, foup.apple.com, appleseed.apple.com, discuss.appleseed.apple.com, appleseedwo.apple.com, seedsurvey.apple.com, dm.apple.com, seoext.apple.com, tcext.apple.com, gdext.apple.com, unihan.unicode.org spazmozart.apple.com, campusreps.apple.com, educationstreaming.apple.com, loc.apple.com, index-backup.apple.com, locsubmit.apple.com, talkies.apple.com, communityloc.apple.com, communitydev.apple.com, dts.ftp.apple.com, electron.apple.com, www.seminars.apple.com, www2.seminars.apple.com, www3.seminars.apple.com, db1.seminars.apple.com, db3.seminars.apple.com, list.seminars.apple.com, consultants.apple.com, consultants2.apple.com, streaming.seminars.apple.com, agents.apple.com, programs.apple.com, edsol.apple.com, alibak1.apple.com, newali.apple.com, alimed1.apple.com, oldali.apple.com, alistr1.apple.com, alistg1.apple.com, homeroom.apple.com, uldemo.apple.com, qtamigos.apple.com, qtdevseed.apple.com, qtpartners.apple.com, qtvr20test.apple.com, qtsdemo.apple.com, atis.training.apple.com, cgidb.training.apple.com, cgi6.training.apple.com, cgi3.training.apple.com, benedict.apple.com, alpha.ns.apple.com, beta.ns.apple.com, gamma.ns.apple.com, as.apple.com, alink-gw.apple.com, www.apple.com, www2.apple.com, www3.apple.com, webmaster.apple.com, 
bananajr6000.apple.com, skunkworks.apple.com, macip.apple.com
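The block above is comma-separated and contains repeats (www.apple.com and stream.apple.akadns.net, for example, appear several times), whereas the loop further down reads one domain per line. A minimal sketch of one way to prepare the list follows. The function name normalize_domains and the file names in the usage note are hypothetical, not part of the original workflow.

```shell
# Hypothetical helper: turn a comma-separated host list (stdin) into
# unique, one-per-line entries (stdout), preserving first-seen order.
normalize_domains() {
  tr ',' '\n' |                                        # one entry per line
    sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//' |    # trim surrounding whitespace
    grep -v '^$' |                                     # drop empty entries
    awk '!seen[$0]++'                                  # keep only the first occurrence
}
```

Something like `normalize_domains < domains_raw.txt > domains.txt` would then produce the input file for the loop. Entries that still contain an embedded space after this step are two hosts run together and would need to be split by hand.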
Before curation, the wayback_machine_downloader was used to sample the block of subdomains and collect a list of every directory available for each subdomain. This process was logged, generating a 2.52GB text log file. The following shell script was used to generate that log file. The -l option prevented downloads.
# Dry run: list the matching archived files for each domain (one domain per line in domains.txt)
while read -r line
do
    wayback_machine_downloader -l -d "$line"_IA -c6 --only "/\.(pkg|as|hqx|cpt|bin|sea|sit|sitx|dd|pit)$/i" "$line"
done < /path/to/domains.txt | tee -a log.txt
Short form | Long form | Description |
---|---|---|
-l | --list | Only list file urls in a JSON format with the archived timestamps, won’t download anything |
-d | --directory PATH | Directory to save the downloaded files into |
-c | --concurrency NUMBER | Number of multiple files to download at a time |
-s was omitted because it forces every timestamp to download into its own directory, which requires a substantial amount of time to post-process. See this GitHub issue for more.
The Finder window on the left shows an ongoing download with the -s flag omitted. On the other hand, applying the -s flag results in a very complex directory tree that packages each snapshot into its own timestamp folder. For the purpose of this project, packaging into timestamp folders is not desirable.
Scrolling through the dry run log.txt file, some familiar filetypes were found, but it quickly became obvious that going through a 2.5GB file manually would be both impractical and very lengthy.
- .cwk: a document created by Apple’s AppleWorks or ClarisWorks productivity suite.
- .vtt: WebVTT (Web Video Text Tracks Format), a text data format used to store subtitles, chapters, descriptions and other metadata.
- .mp4: MPEG-4 Part 14, a multimedia file commonly used to store a movie or video clip. It may also contain subtitles or images.
- .usdz: a 3D scene or object saved in the USDZ Universal format, which is developed by Apple and Pixar Animation Studios. It is an uncompressed and unencrypted ZIP archive that stores a Universal Scene Description (.usd, .usda, or .usdc) file, which includes 3D geometry and shading data.
- .mov: another type of digital container file for videos and other multimedia, also known as QuickTime File Format, or QTFF. Apple developed MOV for use with the Apple QuickTime Player.
- .mp3: an audio file saved in a compressed audio format developed by the Moving Picture Experts Group (MPEG) that uses “Layer 3” audio compression (MP3).
- .ps: the Adobe PostScript file format, developed by Adobe in 1982. It is widely used by publishers, primarily for printing purposes. PS files contain text and images on the same page.
- .svg: a graphics file that uses a two-dimensional vector graphic format created by the World Wide Web Consortium (W3C).
- .cab: Cabinet, an archive-file format for Microsoft Windows that supports lossless data compression and embedded digital certificates used for maintaining archive integrity. Cabinet files have the .cab filename extension and are recognized by their first 4 bytes, MSCF.
- .p7s: an email message that includes a digital signature. It can be used for sending secure emails that can only be viewed by the intended recipient. P7S files verify that the email is from who it claims to be from and that the email has not been modified in transit.
- .pgp: a security key or digital signature file that verifies a user’s identity; used for decrypting a file encrypted with Pretty Good Privacy (PGP) software; ensures that protected files can only be opened by authorized users.
- .tif: an image format file for high-quality graphics. TIF files are also called .tiff, which stands for “Tagged Image File Format.” TIF files were created in 1986 as a file format for scanned images, in an attempt to get all companies to use one standard file format instead of several.
- .json: a file that stores simple data structures and objects in JavaScript Object Notation (JSON) format, which is a standard data interchange format.
- .dmg: a mountable disk image used to distribute software to the macOS operating system.
- .tar.gz: a combination of TAR packaging followed by GNU zip (gzip) compression. It is commonly used in Unix-based operating systems.
- .zip: an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.

The following four shell commands were considered in an attempt to isolate file extensions scattered throughout the log.txt file, but due to the size of the file, isolation was not without its challenges. Each of these commands produced different results. The first command was rather effective, trimming most of the noise out of the file, while the fourth was much more conservative, requiring more manual post-editing than the first.
sed -E 's/^.*(\.[^\.]+)$/\1/' logfile.txt | sort | uniq -c
grep 'http://.*[^/]$' logfile.txt | awk -F/ '{print $NF}' | sed -E 's/^.*(\.[^\.]+)$/\1/' | sort | uniq -c
grep 'http://.*[^/]$' logfile.txt | awk -F/ '{print $NF}' | cut -d. -f2 | sort | uniq -c
grep 'http://.*[^/]$' logfile.txt | grep -v '/?' | awk -F/ '{print $NF}' | sed -E 's/^.*(\.[^\.]+)$/\1/' | sort | uniq -c
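A possible refinement, which was not one of the four commands actually tried, is to strip query strings before isolating the extension and to fold case so that .PDF and .pdf are counted together. The sketch below assumes only that each URL in the log starts with http:// or https:// and ends at a double quote or whitespace; the function name count_url_extensions is hypothetical.

```shell
# Hypothetical helper: tally the file extensions of all URLs found in the log file $1.
count_url_extensions() {
  grep -oE 'https?://[^"[:space:]]+' "$1" |        # pull out every URL, one per line
    sed -E 's/[?#].*$//' |                         # drop query strings and fragments
    awk -F/ 'NF > 3 && $NF != "" {print $NF}' |    # last path segment; skip bare hosts and directories
    grep -E '\.[A-Za-z0-9]+$' |                    # keep names that end in an extension
    sed -E 's/^.*\.([A-Za-z0-9]+)$/\1/' |          # isolate the extension
    tr '[:upper:]' '[:lower:]' |                   # fold case
    sort | uniq -c | sort -rn                      # tally, most frequent first
}
```

Requiring a path after the host keeps a bare URL such as http://www.apple.com from being counted as a .com file.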
The script considered only certain filetypes for curation. As mentioned earlier, other filetypes could be curated, notably PDF along with text files, but the latter could result in many robots.txt files travelling inbound.
At first glance a robots.txt file may appear to be of little interest, but sometimes these files reveal interesting subdomains that are ignored by spiders, miscellaneous notes and other tidbits of data. The Internet Archive’s Wayback Machine dutifully ignores the contents of robots.txt files.
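As a rough illustration of how such tidbits could be pulled out once robots.txt files have been downloaded, the sketch below lists the unique Disallow paths found under a directory. It assumes the files were saved under the name robots.txt; the function name list_disallowed_paths is hypothetical.

```shell
# Hypothetical helper: list the unique Disallow paths in every robots.txt under directory $1.
list_disallowed_paths() {
  find "${1:-.}" -type f -name 'robots.txt' -exec cat {} + |
    tr -d '\r' |                                   # tolerate CRLF line endings
    sed -nE 's/^[[:space:]]*[Dd]isallow:[[:space:]]*([^[:space:]#]+).*$/\1/p' |   # keep the path, drop trailing comments
    sort -u                                        # one line per distinct path
}
```

Empty Disallow lines, which permit everything, are skipped because they carry no path.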
This command walks (traverses) the working directory tree and provides the total number of files while ignoring .DS_Store files.
find . ! -name '.DS_Store' ! -type d | wc -l
This command traverses the working directory tree, finds all filetypes with the exception of .DS_Store files, and displays the number of files found for each filetype.
find . ! -name '.DS_Store' ! -type d|sed -E 's/^.*(\.[^\.]+)$/\1/'|sort|uniq -c
This command works like the one above, but reports the number of files of each filetype separately for each directory in the working directory. It is responsible for the data in the table below.
for d in *; do echo "$d"; find "$d" ! -name .DS_Store ! -type d|sed -E 's/^.*(\.[^\.]+)$/\1/'|sort|uniq -c; done
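One caveat with the sed expression is that a file with no extension falls back to the last dot anywhere in its path, so a name such as www.apple.com/README is reported as .com/README. The variant below is a sketch rather than the command that produced the table: it looks only at the file name, lowercases the extension, and prints tab-separated rows that are easier to turn into a table. The function name ext_counts_by_dir and the "(none)" label are assumptions.

```shell
# Hypothetical helper: print "directory<TAB>extension<TAB>count"
# for each top-level directory under $1 (default: current directory).
ext_counts_by_dir() {
  root=${1:-.}
  for d in "$root"/*/; do
    [ -d "$d" ] || continue                  # skip when the glob matches nothing
    name=$(basename "$d")
    find "$d" -type f ! -name '.DS_Store' |
      awk -F/ '{print $NF}' |                # file name only, so dots in directory names are ignored
      awk -F. 'NF > 1 && $NF != "" {print tolower($NF); next} {print "(none)"}' |
      sort | uniq -c |
      awk -v dir="$name" '{print dir "\t" $2 "\t" $1}'
  done
}
```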
Some of the domains in the original block are not listed in the table below. When the Wayback Machine probed those domains, the object no longer existed on the server, the link followed was outdated or inaccurate, or a robots.txt file set a prohibition. In some cases client or server errors (40x and 50x) or redirections (30x) may have caused the unavailability.
The first run was restricted to the following common Macintosh filetypes
Domain - Macintosh filetypes | .as | .bin | .hqx | .pkg | .sit | .sea | .sitx | .cpt | |
---|---|---|---|---|---|---|---|---|---|
agents.apple.com | 2 | ||||||||
apple-darwin.com | 2 | ||||||||
applescript.apple.com | 37 | 8 | |||||||
appleseed.apple.com | 5 | ||||||||
bananajr6000.apple.com | 11 | 2 | |||||||
canadaapp.apple.com | 2 | ||||||||
colorsync.apple.com | 8 | 11 | 25 | ||||||
consultants.apple.com | 17 | ||||||||
daw.apple.com | 401 | ||||||||
developer.apple.com | 23 | 260 | 9115 | 21 | |||||
developer1.apple.com | 11 | ||||||||
developer2.apple.com | 16 | ||||||||
education.apple.com | 152 | ||||||||
fonts.apple.com | 40 | 1 | 1 | ||||||
galileo.apple.com | 4 | ||||||||
guide.apple.com | 1 | 4 | |||||||
help.apple.com | 28 | ||||||||
hotdeals.apple.com | 61 | ||||||||
hypercard.apple.com | 2 | ||||||||
lists.apple.com | 138 | 11 | |||||||
macos.apple.com | 3 | 5 | |||||||
mirror.apple.com | 722 | 31974 | 20 | 2 | 1 | 5 | |||
newali.apple.com | 51 | 448 | 63 | ||||||
oldali.apple.com | 23 | 6 | |||||||
ppclinux.apple.com | 7 | ||||||||
product.info.apple.com | 1 | 50 | |||||||
programs.apple.com | 4 | ||||||||
qtj.apple.com | 3 | ||||||||
register.apple.com | 5 | ||||||||
resellerapplication.apple.com | 5 | ||||||||
retail.apple.com | 18 | 2 | |||||||
slg.apple.com | 1 | ||||||||
speech.apple.com | 30 | 2 | |||||||
ssl.apple.com | 206 | ||||||||
train.apple.com | 182 | ||||||||
ws2.quicktime.apple.com | 4 | ||||||||
www.apple.at | 3 | ||||||||
www.apple.be | 3 | 46 | |||||||
www.apple.ca | 5 | 192 | 2 | ||||||
www.apple.ch | 3 | ||||||||
www.apple.co.jp | 20 | 19 | 450 | 1 | |||||
www.apple.de | 75 | ||||||||
www.apple.dk | 7 | ||||||||
www.apple.es | 3 | 4 | |||||||
www.apple.fr | 4 | ||||||||
www.apple.it | 3 | ||||||||
www.apple.nl | 10 | ||||||||
www.apple.no | 15 | ||||||||
www.apple.se | 33 | ||||||||
www.euro.apple.com | 8 | 138 | |||||||
www.homepage.mac.com | 275 | 238 | 6218 | 2 | 539 | 12 | 64 | ||
www.latinamerica.apple.com | 3 | 1 | |||||||
www.opensource.apple.com | 11 | 26 | 84 | 783 | 1 | 1 | |||
www.seminars.apple.com | 82 | ||||||||
www.spruce-tech.com | 12 | ||||||||
www.support.apple.com | 1 | 14 | |||||||
www.webdvd.org | 3 | ||||||||
www0.info.apple.com | 9 | 893 | 1 | ||||||
www2.apple.com | 1 | ||||||||
www2.seminars.apple.com | 82 | ||||||||
www3.apple.com | 1 | ||||||||
www3.seminars.apple.com | 82 |
The second run was restricted to the following disk image filetypes
(/\.(dmg|img|smi)$/i)
Domain - Macintosh images | .as | .bin | .hqx | .pkg | .sit | .sea | .sitx | .cpt | |
---|---|---|---|---|---|---|---|---|---|
agents.apple.com | 2 |
The third run was restricted to txt filetypes; robots.txt files were isolated from other .txt files.
(/\.(txt|)$/i)
As files travel inbound, some 734 sit, hqx, cpt.bin, bin, sea and other archive filetypes already populate their destination folder, and these are from the now-defunct homepage.mac.com domain alone. It wouldn’t be too far-fetched to assume that in some cases these files represent developer works, in which case accompanying documentation should also be collected; otherwise the more obscure files might lack context. Some files from the domain will be familiar to some: RasterOps_3.2.1.cpt.bin, RadiusWare3.4.1.cpt.bin, Halo_1.05.3_Updater.sit, Halo_1.03_Updater.sit, Glider9.sit, and the list goes on.
The following file formats have been retrieved as a first step. Normally, filetypes outside those listed below would not be considered, but given that these are Apple subdomains, widening the net could yield some interesting results.
pkg|as|hqx|cpt|bin|sea|sit|sitx|dd|pit|pdf
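Each run in this post differs only in the extension list passed to --only, so the filter can be generated rather than retyped. The sketch below builds the same kind of pattern used in the earlier loop from a list of extensions; the function name build_only_pattern is hypothetical.

```shell
# Hypothetical helper: build a wayback_machine_downloader --only filter
# from the extensions given as arguments.
build_only_pattern() {
  joined=$(printf '%s|' "$@")       # e.g. "dmg|img|smi|"
  joined=${joined%"|"}              # drop the trailing pipe
  printf '/\\.(%s)$/i\n' "$joined"  # e.g. /\.(dmg|img|smi)$/i
}
```

Inside the loop it could be used as `--only "$(build_only_pattern pkg as hqx cpt bin sea sit sitx dd pit pdf)"`.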