  • [Dec 30, 2000, 5:52 pm] voot downtime

    voot (www154) crashed today under a heavy load, and was brought back online within 15 minutes.
  • [Dec 28, 2000, 9:09 pm] quan upgrade completed

    The drive upgrade of quan (www120) has been completed. It now has 30.7 GB of total disk space.
  • [Dec 28, 2000, 7:08 pm] vsael Downtime

    vsael (www173) crashed under heavy load. It has been returned to normal operation, and had a downtime of about 10 minutes.
  • [Dec 28, 2000, 3:33 pm] quan upgrade

    We are about to begin a drive upgrade on quan (www120) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Dec 28, 2000, 12:35 pm] jeta Downtime

    jern (www142) crashed under heavy load, and had to be manually rebooted. It was brought back up after a thorough disk cleaning. Total downtime was approximately 15 minutes.
  • [Dec 27, 2000, 9:38 am] berkano downtime

    berkano (www147) crashed under heavy load, and was brought back online with 10 minutes of downtime.
  • [Dec 26, 2000, 9:31 am] vsael Downtime

    vsael (www173) crashed today under heavy load, and had to be manually rebooted. It came up cleanly after a thorough disk check. Total downtime was approximately 15 minutes.
  • [Dec 25, 2000, 9:50 am] vilya Downtime

    vilya (www60) crashed this morning due to heavy load, but could not reboot automatically. Because of this, the server could not be refreshed automatically. After a thorough disk cleaning, the machine was brought back up. Total downtime was approximately one hour.
  • [Dec 24, 2000, 1:52 pm] omega downtime

    omega (www16) crashed, and was brought back online. Total downtime was a little under ten minutes.
  • [Dec 22, 2000, 8:27 am] kodh downtime

    kodh (www90) crashed under high load. Total downtime was less than 25 minutes.
  • [Dec 20, 2000, 3:45 pm] naam upgrade completed

    We have completed the drive upgrade on naam (www114). It now has a 30.5GB hard drive. Total downtime was under 5 minutes.
  • [Dec 20, 2000, 10:58 am] naam upgrade

    We have begun a drive upgrade on naam (www114) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Dec 20, 2000, 3:53 am] gao downtime

    gao (www111) crashed under high load and had to be rebooted. Total downtime was 5 minutes.
  • [Dec 19, 2000, 1:05 pm] neter Downtime

    neter (www132) crashed and was brought back online with less than 15 minutes of downtime.
  • [Dec 18, 2000, 10:12 pm] theta Upgrade

    The upgraded theta had problems with its Ethernet card after only a few minutes online. After swapping the card and cable, theta is now back to normal operation. We apologize for the additional downtime and inconvenience.
  • [Dec 18, 2000, 9:16 pm] theta Upgrade - Update

    theta (www4) suffered some complications with the current upgrade. It will be down for the next 20 minutes while we fix the outstanding issues.
  • [Dec 18, 2000, 8:45 pm] theta upgrade completed

    The work on theta (www4) has been completed. This server is upgraded to a P-III 866 with 256 MB RAM and a 30.7 GB hard drive.
  • [Dec 18, 2000, 6:37 pm] theta upgrade

    theta (www4) will be taken down briefly to complete a systems upgrade. Downtime should be no more than 10 minutes.
  • [Dec 17, 2000, 12:16 am] bemnet Downtime

    bemnet crashed under heavy load and was brought back online after a filesystem check. Downtime was approximately fifteen minutes.
  • [Dec 14, 2000, 3:58 am] db12 Hardware replacement

    We have determined that the hard drive in db12 is dying, with unrecoverable errors. We are performing an emergency drive swap; the server will be offline for up to 60 minutes while this is done. No customer data will be lost.
  • [Dec 14, 2000, 3:30 am] sowilu Downtime

    sowilu (www145) was down early this morning for about 10 minutes. It has since returned to normal operation.
  • [Dec 13, 2000, 5:00 pm] fehu Downtime

    fehu (www134) crashed under heavy load. Downtime was 10 minutes.
  • [Dec 13, 2000, 7:43 am] quan downtime

    quan (www120) crashed under high load and rebooted itself. Total downtime was less than 10 minutes.
  • [Dec 12, 2000, 11:12 pm] xrra Downtime

    xrra (www116) crashed under heavy load. Downtime was 15 minutes.
  • [Dec 12, 2000, 9:05 pm] FrontPage Extensions

    In the process of correcting an obscure problem with Apache 1.3.14 under FreeBSD 4.1.1-STABLE earlier today, we inadvertently broke the FrontPage extensions for some customer sites. This capability is being restored at this time, and all FrontPage extensions on -STABLE servers should be in good working order within the next 30 minutes.

    Please accept our apologies for the inconvenience. The other problem being corrected involved access to large PDF files from certain versions of Internet Explorer.

  • [Dec 12, 2000, 1:41 pm] jarre Downtime

    jarre (www172) crashed under load and was rebooted. Downtime was less than 15 minutes.
  • [Dec 8, 2000, 6:24 pm] kodh Downtime

    kodh (www90) crashed under heavy load. Downtime was 20 minutes.
  • [Dec 8, 2000, 3:55 am] halla Upgrade Completed

    The drive swap of halla (www68) has completed. It now has 20.5 GB of total disk space.
  • [Dec 7, 2000, 11:42 pm] halla Upgrade Failure

    During the drive swap on halla (www68), the drive that was suspected of having data errors has failed. We are in the process of rebuilding the server, and expect it to return to normal service within the next hour.
  • [Dec 7, 2000, 10:38 pm] halla upgrade

    We have begun a drive upgrade on halla (www68) in order to improve storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Dec 7, 2000, 2:44 am] wunjo Downtime

    wunjo (www140) crashed due to high load and had to be rebooted. Total downtime was under 5 minutes.
  • [Dec 6, 2000, 5:18 pm] theta upgrade

    We have begun a drive and system upgrade on theta (www4) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Dec 6, 2000, 11:13 am] pi downtime

    pi (www10) crashed under heavy load, and was brought back online. Downtime was around 15 minutes.
  • [Dec 5, 2000, 11:17 am] bemnet downtime

    bemnet (www158) crashed under heavy load, and automatically rebooted after a thorough disk cleaning. Total downtime was approximately 15 minutes.
  • [Dec 4, 2000, 11:46 pm] halla upgrade

    halla (www68) was upgraded to an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Dec 4, 2000, 11:29 pm] yanta upgrade

    yanta (www67) was upgraded to an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Dec 4, 2000, 11:12 pm] hyarmen upgrade

    hyarmen (www66) was upgraded to an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Dec 4, 2000, 10:48 pm] lambe upgrade

    lambe (www63) was upgraded to an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Dec 4, 2000, 10:25 pm] anga upgrade

    anga (www47) was upgraded to an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Dec 2, 2000, 5:02 pm] derba downtime

    derba (www104) crashed under heavy load. It has been returned to normal service with a downtime of approximately 5 minutes.
  • [Dec 2, 2000, 4:50 pm] aster downtime

    aster (www96) crashed under heavy load, and required an extensive manual filesystem cleaning before being returned into service. Downtime was approximately 15 minutes.
  • [Dec 1, 2000, 9:42 pm] chi downtime

    chi (www14) crashed under heavy load and had to be rebooted. Total downtime was under 5 minutes.
  • [Dec 1, 2000, 9:31 pm] Power Distribution Problem

    A power distribution failure in one datacenter rack led to brief downtime for sasi (www51), pyyl (www97), zhun (www107), hwesta (www207), and db15. A slightly loose connection on an industrial power strip led to arcing and scoring of the plug; the problem has been rectified, and the affected servers were out of service for less than ten minutes each.

    No power protection system is perfect; this low-level problem in the final stage of power distribution is difficult to avoid. Our primary power systems, including transformers, TVSS, UPS, ATS, and genset, were not involved in this problem. We do not expect the problem to recur. Please accept our apologies for the brief interruption of service.

  • [Dec 1, 2000, 12:23 pm] eeoth upgrade completed

    The upgrade of eeoth (www85) has been completed. The server is now a Pentium III, 866 MHz with 256 MB of RAM and 30 GB of space. Downtime was under ten minutes.
  • [Dec 1, 2000, 11:57 am] reit upgrade completed

    The upgrade of reit (www108) has been completed. The server now has 30 GB of total disk space. Downtime was under five minutes.
  • [Dec 1, 2000, 5:58 am] haebrath downtime

    haebrath (www156) crashed under high load and had to be rebooted. Total downtime was less than 10 minutes.
  • [Nov 30, 2000, 10:51 pm] reit upgrade

    We have begun a drive upgrade on reit (www108) in order to improve storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 30, 2000, 10:32 pm] eeoth upgrade

    We have begun a drive and system upgrade on eeoth (www85) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 30, 2000, 10:04 am] anca maintenance

    The hardware on anca (www53) has been swapped, following problems after its recent upgrade. This is expected to prevent any further problems. Downtime was under five minutes.
  • [Nov 30, 2000, 9:03 am] PairList Upgrade Planned

    We are pleased to announce the planned upgrade of our PairList service to Mailman 2.0. This major upgrade will take place on Monday, December 3, and is outlined at http://www.pairlist.net/upgrade.shtml
  • [Nov 30, 2000, 2:11 am] anca downtime

    anca (www53) crashed once more and was brought back online. We continue to investigate the cause of the recent crashes. Total downtime was approximately 15 minutes.
  • [Nov 30, 2000, 12:22 am] anca Downtime

    anca (www53) crashed and was brought back online after a filesystem cleaning. We are currently investigating the cause of recent crashes on this server.
  • [Nov 29, 2000, 9:11 pm] uilen Downtime

    uilen (www35) crashed under load and required an extensive manual filesystem cleaning before being returned to service. Downtime was approximately 40 minutes.
  • [Nov 29, 2000, 4:23 pm] PHP3 Magic Quotes in -STABLE

    Currently, the magic_quotes_gpc feature in PHP 3 under FreeBSD -STABLE is set to 'on'. However, based on user feedback, we have decided to restore the previous setting of 'off'. Because PHP 3 still runs as a CGI, there is no way for user scripts to change this setting. Therefore it is better to leave it unchanged by the -STABLE upgrade.

    The new setting will be deployed at 8am Eastern time on Thursday, November 30. If you have any PHP 3 scripts running on -STABLE servers which depend on this setting (which was only introduced with the -STABLE upgrade), please correct them at that time.

    Please note that the settings may be tuned for PHP 4, as it runs as an Apache module. In general, we encourage our users to upgrade their code to use PHP 4 wherever possible.

  • [Nov 29, 2000, 3:44 pm] ulwar upgrade completed

    The hard drive upgrade on ulwar (www82) has been completed. Due to complications, the server's hardware upgrade has been postponed. Downtime was under ten minutes.
  • [Nov 29, 2000, 12:54 pm] umbar upgrade completed

    The upgrade of umbar (www46) has been completed. The server now has 30 GB of total disk space. Downtime was under five minutes.
  • [Nov 29, 2000, 12:03 pm] anca downtime

    anca (www53) crashed under heavy load and had to be manually rebooted. Total downtime was under 15 minutes.
  • [Nov 29, 2000, 9:01 am] anca Downtime

    anca (www53) crashed and was brought back online with downtime of approximately 20 minutes.
  • [Nov 28, 2000, 11:55 pm] ulwar upgrade

    We have begun a drive and system upgrade on ulwar (www82). There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 28, 2000, 11:32 pm] umbar upgrade

    We have begun a drive upgrade on umbar (www46) to improve storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 28, 2000, 5:51 pm] STABLE Upgrades

    The -STABLE upgrades that took place on maborym (www204) and orvni (www205) today encountered problems which led to several additional brief intervals of downtime for each server. A workaround was developed, and the upgrades have been completed normally for both servers. The problem is specifically related to the large drives being used (30GB). Please accept our apologies for the downtime; we will ensure this does not recur in future upgrades.
  • [Nov 28, 2000, 3:32 pm] maborym upgrade

    maborym (www204) is currently undergoing upgrade procedures to 4.1-STABLE. Due to some difficulties with this process, we expect this server to be down for another 10 minutes while this is completed. Normal service should return shortly afterwards.
  • [Nov 28, 2000, 1:08 pm] omega upgrade completed

    The upgrade of omega (www16) has been completed. The server now features 30 GB of total space, and is an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under ten minutes.
  • [Nov 28, 2000, 10:41 am] anca upgrade completed

    The upgrade of anca (www53) has been completed. The server now features 30 GB of total space, and is an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Nov 27, 2000, 11:47 pm] hwesta upgrade

    hwesta (www51) was upgraded to an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Nov 27, 2000, 11:21 pm] umbar upgrade

    umbar (www46) was upgraded to an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under ten minutes.
  • [Nov 27, 2000, 10:54 pm] anca upgrade

    We have begun a drive and system upgrade on anca (www53). There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 27, 2000, 10:43 pm] ando upgrade

    ando (www45) was upgraded to an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Nov 27, 2000, 10:18 pm] onn upgrade

    onn (www29) has been upgraded to an Athlon Thunderbird 1000 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Nov 27, 2000, 10:13 pm] omega upgrade

    We have begun a drive and system upgrade on omega (www16) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 25, 2000, 11:41 pm] Network Problems Readdressed

    A recurrence of the same internal router failure we experienced Saturday night led to a brief interruption of connectivity for a few customers. The total duration of the outage was less than five minutes. We will be shifting traffic away from the affected router within the next 24 hours, and expect no further disruptions of traffic.
  • [Nov 24, 2000, 11:17 pm] Network Problems Resolved

    The problem causing our internal routing issue has been found and corrected. All routing is behaving normally again. This issue did not affect all of our customers, and we apologize sincerely to the customers who were affected.
  • [Nov 24, 2000, 10:45 pm] Network Problems

    We are currently experiencing an internal routing problem, which may prevent some of our customers from reaching their sites. We are working diligently to investigate and remedy this issue, and hope to have the problem solved shortly.
  • [Nov 22, 2000, 10:54 am] or upgrade completed

    The upgrade on or (www34) has been completed. The server now has 30 GB of disk space. Downtime was under five minutes.
  • [Nov 21, 2000, 11:09 pm] or upgrade

    We have begun a drive and system upgrade on or (www34) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 16, 2000, 6:04 pm] anca Downtime

    anca (www53) crashed under heavy load. Downtime was under 10 minutes.
  • [Nov 16, 2000, 4:58 pm] gebo downtime

    gebo (www139) crashed under heavy load, and was rebooted after a thorough disk cleaning. Total downtime was approximately 10 minutes.
  • [Nov 16, 2000, 4:44 pm] pyyl upgrade completed

    The upgrade of pyyl (www97) has been completed. The server now has 30 GB of space and is a Pentium III 866 MHz, with 256 MB RAM. Downtime was around ten minutes.
  • [Nov 16, 2000, 5:57 am] vuae downtime

    vuae (www93) crashed under high load and had to be rebooted. Total downtime was under 5 minutes.
  • [Nov 15, 2000, 8:53 pm] pyyl upgrade

    We have begun a drive and system upgrade on pyyl (www97) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 15, 2000, 6:30 pm] zhun upgrade

    We have completed the upgrade of zhun (www107). It is now an Athlon Thunderbird 1000 MHz, with 256MB RAM, and 30.5GB of disk space.
  • [Nov 15, 2000, 3:43 pm] STABLE Upgrades

    Due to a scheduling error, the servers intended for the -STABLE upgrade on Thursday, November 16th were in fact upgraded today, November 15th. We have therefore rescheduled Wednesday's servers for Thursday. Please accept our apologies for the mixup; it should not happen again.

    We have addressed every outstanding problem we are aware of; if you encounter any problems as a result of the -STABLE upgrade, please e-mail urgent@pair.com. The upgrade notice in the Support Forum has been slightly updated: http://support.pair.com/notices/freebsd35.html

  • [Nov 15, 2000, 12:44 pm] zhun upgrade

    We are about to begin a drive upgrade of zhun (www107). There will be two short periods of downtime, each approximately five minutes, at the beginning and end of the upgrade. The server will remain online at all other times. We will post an additional notice when the upgrade has been completed.
  • [Nov 14, 2000, 5:57 pm] vilya downtime

    vilya (www60) crashed under heavy load, and was brought back online. Downtime was around 15 minutes.
  • [Nov 14, 2000, 3:05 pm] emancholl downtime

    emancholl (www37) crashed, and was rebooted. Full service has been restored with less than 10 minutes of downtime.
  • [Nov 13, 2000, 4:39 pm] STABLE Upgrades

    Three servers were upgraded to FreeBSD -STABLE today, per the posted schedule. The servers are vsael (www173), biont (www174), and zulle (www175). The schedule has been updated to indicate the upgrades planned for later this week.

    We are not aware of any outstanding problems that are not documented at http://support.pair.com/notices/freebsd35.html - if you encounter any difficulty with your site after this upgrade, please read that page. If you require assistance, please write to urgent@pair.com and we will address the problem promptly.

    The schedule is available at http://support.pair.com/notices/stable-upgrade.html

  • [Nov 13, 2000, 1:31 pm] kodh downtime

    kodh (www90) crashed under heavy load, and was brought back online. Downtime was under 15 minutes.
  • [Nov 13, 2000, 12:06 pm] ynilo downtime

    ynilo (www166) crashed under heavy load, and was brought back online. Downtime was under 10 minutes.
  • [Nov 12, 2000, 7:31 pm] kodh downtime

    kodh (www90) crashed under high load, and was brought back up after a manual check. Total downtime was about 15 minutes.
  • [Nov 10, 2000, 4:15 pm] bemnet downtime

    bemnet (www158) crashed under heavy load. After a thorough disk cleaning, it was brought back up. Total downtime was approximately 10 minutes.
  • [Nov 10, 2000, 1:41 pm] rho upgrade completed

    The maintenance on rho (www11) has been completed. The server now has 20 GB of disk space, and at the same time was upgraded to a Pentium III, 866 MHz with 256 MB of RAM. Downtime was around 15 minutes.
  • [Nov 10, 2000, 11:27 am] auma upgrade completed

    The upgrade on auma (www86) has been completed. The server now has 20 GB of space. Downtime was under five minutes.
  • [Nov 10, 2000, 5:37 am] kodh downtime

    kodh (www90) crashed under high load and had to be rebooted.
  • [Nov 10, 2000, 3:37 am] sether downtime

    sether (www95) crashed under high load and had to be rebooted. Total downtime was a little over 5 minutes.
  • [Nov 9, 2000, 11:08 pm] rho upgrade

    We have begun a drive and system upgrade on rho (www11) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 9, 2000, 10:44 pm] auma upgrade

    We are about to begin a drive upgrade of auma (www86). There will be two short periods of downtime, each approximately five minutes, at the beginning and end of the upgrade. The server will remain online at all other times. We will post an additional notice when the upgrade has been completed.
  • [Nov 9, 2000, 2:10 pm] FreeBSD -STABLE Upgrades

    Because we are still tracing an unusual bug that affects our ability to manage servers, we have moved the -STABLE upgrades planned for Thursday, November 9 to Monday, November 13.

    The -STABLE upgrade unavoidably changes certain aspects of the services available to our customers. In several specific cases, customers will need to modify their usage in order to operate normally after the -STABLE upgrade. Please read about these important changes at http://support.pair.com/notices/freebsd35.html

  • [Nov 9, 2000, 11:33 am] thesel downtime

    thesel (www99) crashed under heavy load, and automatically rebooted. Total downtime was under ten minutes.
  • [Nov 9, 2000, 9:49 am] iota downtime

    iota (www5) crashed under heavy load, and was brought back online. Downtime was under 10 minutes.
  • [Nov 8, 2000, 10:05 pm] kodh upgrade

    We have just completed a drive and system upgrade of kodh (www90). The server now has a 30 GB drive and a 1 GHz Thunderbird processor with 256 MB of RAM. The upgrade took longer than normal, but downtime was under 5 minutes.
  • [Nov 8, 2000, 9:09 pm] raitax Upgrade

    raitax (www171) has been upgraded to FreeBSD 4.1.1-STABLE. Please report any problems to support@pair.com. A complete schedule of future FreeBSD upgrades can be found at: http://support.pair.com/notices/stable-upgrade.html
  • [Nov 8, 2000, 3:43 pm] ilwe downtime

    ilwe (www81) crashed and was rebooted. Downtime was less than 10 minutes.
  • [Nov 8, 2000, 8:36 am] theta Reboot

    theta (www4) was rebooted. Downtime was less than 3 minutes.
  • [Nov 6, 2000, 10:17 pm] kodh upgrade

    We have begun a drive upgrade on kodh (www90) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 6, 2000, 11:57 am] pi downtime

    pi (www10) encountered a problem which required a reboot. Downtime was under five minutes.
  • [Nov 3, 2000, 10:19 am] wawrra upgrade

    The hard drive upgrade of wawrra (www109) has been completed. The server now has 20 GB of space. Downtime was under five minutes.
  • [Nov 3, 2000, 12:05 am] kodh

    Due to unforeseen complexities with the server, the upgrade of kodh has been pushed back for at least 24 hours. We are examining the situation and hope to push ahead with the upgrade this week.
  • [Nov 2, 2000, 11:10 pm] kodh upgrade

    We have begun a drive upgrade on kodh (www90) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 2, 2000, 10:50 pm] wawrra upgrade

    We have begun a drive upgrade on wawrra (www109) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 2, 2000, 10:12 am] fearn Downtime

    fearn (www40) crashed under load and was brought back online with downtime of approximately 15 minutes.
  • [Nov 1, 2000, 10:47 pm] beeoro upgrade

    We're about to begin a drive upgrade of beeoro (www69). There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Nov 1, 2000, 12:54 pm] paat upgrade

    paat (www100) received a hard drive upgrade, and now has 19 GB of space. Downtime was under five minutes.
  • [Nov 1, 2000, 12:35 pm] aedde upgrade

    aedde (www84) received a hard drive upgrade, and now has 14 GB of space. Downtime was under five minutes.
  • [Oct 31, 2000, 9:36 pm] paat upgrade

    We are about to begin a drive upgrade of paat (www100). There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Oct 31, 2000, 8:58 pm] aedde upgrade

    We will begin a drive upgrade of aedde (www84) shortly. There will be two brief periods of downtime, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Oct 31, 2000, 9:29 am] emancholl downtime

    emancholl (www37) crashed under heavy load and has since been rebooted. Downtime was about 10-15 minutes.
  • [Oct 30, 2000, 8:29 pm] thnad Downtime

    thnad (www121) was down this evening for approximately 10 minutes. It has since returned to normal operation.
  • [Oct 30, 2000, 5:34 pm] bemnet downtime

    bemnet (www158) encountered a network error which required the server to be restarted. Downtime was under five minutes.
  • [Oct 30, 2000, 3:02 pm] chi upgrade

    chi (www14) received a hard drive upgrade, and now has 30 GB total space. Downtime was under five minutes.
  • [Oct 30, 2000, 1:27 pm] onn upgrade

    onn (www29) received a hard drive upgrade, and now has 14 GB total disk space. Downtime was under five minutes.
  • [Oct 29, 2000, 10:13 pm] chi upgrade

    We have begun a drive and system upgrade on chi (www14). There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Oct 29, 2000, 9:06 pm] onn upgrade

    We have begun a drive and system upgrade on onn (www29) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Oct 29, 2000, 8:09 pm] gnaaste downtime

    gnaaste (www79) crashed under high load and had to be rebooted. Total downtime was under 10 minutes.
  • [Oct 29, 2000, 12:58 am] SAVVIS Update

    After slightly more than three hours, SAVVIS has restored the flow of traffic between our network and theirs. No explanation of the outage has yet been provided.
  • [Oct 28, 2000, 7:41 pm] SAVVIS Outage

    Shortly after 5pm Eastern time, our DS-3 circuit to SAVVIS stopped passing traffic. This may be a general outage for SAVVIS in either the Pittsburgh area or their New York City POP. We have requested escalation of this problem. In the meantime, customer traffic is unaffected; our other providers are carrying the traffic with no problems.

    We will post regarding any further developments.

  • [Oct 27, 2000, 12:38 pm] vala upgrade

    vala (www59) has been upgraded with a new hard drive, and now has approximately 28 GB of space.
  • [Oct 26, 2000, 11:01 pm] vala upgrade

    We have begun a drive and system upgrade on vala (www59) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Oct 25, 2000, 10:43 pm] sowilu downtime

    sowilu (www145) crashed under heavy load and had to be rebooted. Downtime was under 15 minutes.
  • [Oct 25, 2000, 2:19 pm] kanat downtime

    kanat (www128) crashed under heavy load, and was rebooted. Downtime was under 10 minutes.
  • [Oct 25, 2000, 12:13 pm] kodh downtime

    kodh (www90) crashed under heavy load, and was rebooted after a thorough disk check. Total downtime was approximately 15 minutes.
  • [Oct 25, 2000, 11:51 am] iota upgrade

    iota (www5) has been upgraded to have 15 GB total disk space, and is now running as an AMD Athlon 1000 MHz, with 256 MB of RAM.
  • [Oct 24, 2000, 10:55 am] lepen downtime

    lepen (www129) crashed under heavy load, and was automatically rebooted. Total downtime was under 10 minutes.
  • [Oct 20, 2000, 9:25 pm] elhaz downtime

    elhaz (www144) became unresponsive under heavy load and was rebooted. Downtime was less than 10 minutes.
  • [Oct 19, 2000, 10:59 am] anca downtime

    anca (www53) crashed under heavy load and has since been rebooted. Downtime was less than 15 minutes.
  • [Oct 19, 2000, 10:05 am] bemnet downtime

    bemnet (www158) crashed under heavy load. It was automatically rebooted after a thorough disk check. Total downtime was approximately 10 minutes.
  • [Oct 19, 2000, 4:47 am] uilen downtime

    uilen (www35) crashed under high load and had to be rebooted. Total downtime was under 20 minutes.
  • [Oct 18, 2000, 8:37 am] kayan downtime

    kayan (www133) crashed under high load and had to be rebooted. Total downtime was under 10 minutes.
  • [Oct 18, 2000, 2:30 am] bhoth downtime

    bhoth (www163) crashed under high load and self rebooted. Total downtime was under 10 minutes.
  • [Oct 17, 2000, 7:04 pm] kayan Downtime

    kayan (www133) crashed under heavy load. Downtime was 15 minutes.
  • [Oct 17, 2000, 6:53 pm] Sprint Update

    Our DS-3 circuit to Sprint has remained steady since noon today. We will be working with Sprint engineers in the next few days to resolve two minor outstanding issues with the circuit. Customer traffic should not be affected.
  • [Oct 17, 2000, 1:00 pm] kayan downtime

    kayan (www133) crashed under heavy load, and was brought back online. Downtime was around ten minutes.
  • [Oct 17, 2000, 11:50 am] Sprint Network Trouble

    Since approximately 9:45am Eastern time, we have been seeing packet errors and brief outages on our DS-3 circuit to Sprint. We have a ticket open with Sprint, and they are investigating the issue. Some of the potentially affected traffic has been shifted to Digex in the interim. We will post further information as it becomes available.
  • [Oct 17, 2000, 10:25 am] anca downtime

    anca (www53) crashed under heavy load. It was brought back online after an extensive disk check. Total downtime was approximately 15 minutes.
  • [Oct 14, 2000, 8:27 pm] anca Offline

    anca was offline for approximately fifteen minutes because of a network problem, which has now been corrected.
  • [Oct 13, 2000, 9:55 am] enso downtime

    enso (www153) crashed under heavy load, and was brought back online. Downtime was approximately ten minutes.
  • [Oct 11, 2000, 4:43 pm] gao downtime

    gao (www111) was rebooted to clear up a resource problem. Downtime was under five minutes.
  • [Oct 11, 2000, 12:16 pm] kirlian downtime

    kirlian (www110) crashed under heavy load. It has been brought back online after an extensive disk check. Downtime was approximately 20 minutes.
  • [Oct 10, 2000, 7:16 pm] straif upgrade

    straif (www26) has been upgraded to a 15 GB hard drive, and to a Pentium III, 733 MHz with 256 MB of RAM.
  • [Oct 10, 2000, 7:01 pm] kayan Downtime

    kayan (www133) crashed under load. It was rebooted. Downtime was 10 minutes.
  • [Oct 10, 2000, 12:48 am] SAVVIS Resolution

    After an overall outage duration of six hours, our SAVVIS circuit was restored to normal operation around 11:05pm Eastern time. We will monitor for any further difficulties. We do not yet have details of the actual problem in their New York POP.
  • [Oct 9, 2000, 11:40 pm] straif upgrade

    We have begun a drive and system upgrade on straif (www26) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete. We will post an additional notice when the upgrade has been completed.
  • [Oct 9, 2000, 11:24 pm] eeoth Downtime

    eeoth (www85) crashed due to heavy load. It was rebooted and brought back online. Downtime was 10 minutes.
  • [Oct 9, 2000, 10:22 pm] SAVVIS Update

    For the past hour, we have been successfully delivering traffic outbound via SAVVIS. However, SAVVIS is still not accepting our network routes, and consequently no traffic is flowing in on that circuit. Due to the fact that most of our inbound traffic flows through other providers, who are covering the traffic easily, this is not much of a problem. The lengthy delay in fixing their configuration is arguably cause for concern, however.

    It is likely that the circuit will go down at least once more while they fix this configuration. We will post further updates as news develops.

  • [Oct 9, 2000, 8:46 pm] SAVVIS Update

    SAVVIS had our circuit offline for nearly two and a half hours this evening, following a crash of their New York POP. After having some trouble locating our customer information, they are now working to rebuild the BGP session necessary to carry our traffic. The circuit has briefly been operational for outbound traffic only, but is now being reconfigured.

    Traffic has been carried by our other backbone providers without difficulty. We will post a further update when it becomes available.

  • [Oct 9, 2000, 5:56 pm] SAVVIS Trouble

    We are currently seeing high latency and packet loss on our circuit to SAVVIS. This may be affecting performance for a small percentage of our visitors; we are following up with SAVVIS to determine the expected duration of the problem.
  • [Oct 9, 2000, 1:03 pm] kayan downtime

    kayan (www133) crashed under heavy load, and was brought back online. Downtime was under 10 minutes.
  • [Oct 9, 2000, 1:08 am] theta downtime

    theta (www4) had to be rebooted in order to restart the respawn log of the server. Total downtime was less than 5 minutes.
  • [Oct 8, 2000, 6:28 pm] thurisaz Downtime

    thurisaz (www136) crashed and rebooted. Downtime was approximately 10 minutes.
  • [Oct 8, 2000, 3:31 am] fearn Downtime

    fearn has been offline for approximately one hour; we have traced the outage to a failed Ethernet controller. The server will be back online within the next ten minutes.
  • [Oct 7, 2000, 7:38 pm] derba Downtime

    derba (www104) crashed and rebooted. Downtime was 10 minutes.
  • [Oct 7, 2000, 8:14 am] bemnet Downtime

    bemnet crashed under heavy load, and was down for approximately twenty minutes during extensive filesystem cleaning.
  • [Oct 6, 2000, 2:34 pm] xi downtime

    xi (www8) rebooted, and was down for approximately 15 minutes for extensive automatic filesystem cleaning. All services have returned as of this time.
  • [Oct 6, 2000, 1:48 am] calma Restored

    calma has been restored to full service. The server was online throughout the file restoration process, in order to maximize the availability of sites as they were being restored. The backup is from Friday morning. Unfortunately, Web and FTP logs for all hits on Friday were lost as a result of the drive failure. Also, due to an unfortunate interaction between our backup system and pair2000, the contents of virtual mailboxes were not recoverable from backups. Because this server was converted to pair2000 just one week ago, however, there were very few mailboxes configured.

    We have corrected this interaction. No other customer data was lost. The server was offline for about 90 minutes, after which another 90 minutes was required to restore data from backups. In many ways, the pair2000 system simplified the data restoration process.

    The drive in calma was old and normally would have been replaced as part of routine hardware upgrades. We will continue to strive to stay ahead of these failures; currently we are dealing with parallel upgrades for pair2000, CPU/memory, larger/newer drives, and FreeBSD 3.5.

    We apologize to our customers for this inconvenient emergency.

  • [Oct 5, 2000, 11:42 pm] calma Update

    calma (www43) is in the process of getting the primary drive rebuilt. Expected time to completion is 2 hours. More information will be posted as needed.
  • [Oct 5, 2000, 10:55 pm] calma Drive Failure

    calma (www43) suffered a primary drive failure around 10:30 pm EST. We are in the process of rebuilding the drive. More information will be posted as it becomes available.
  • [Oct 5, 2000, 7:04 pm] kayan Downtime

    kayan (www133) crashed under heavy load. It has since been brought back online. Total downtime was approximately 10 minutes.
  • [Oct 5, 2000, 3:05 pm] omicron downtime.

    omicron (www9) crashed under heavy load. It has since been brought back online. Total downtime was under 15 minutes.
  • [Oct 5, 2000, 12:45 pm] bhoth downtime

    bhoth (www163) crashed under heavy load. It automatically rebooted, and came back up with no errors. Total downtime was under five minutes.
  • [Oct 4, 2000, 6:28 pm] neter maintenance completed

    The hard drive swap on neter (www132) has been completed. The server now has 18 GB total space.
  • [Oct 4, 2000, 3:37 am] glikk downtime

    glikk (www119) crashed under high load and had to be rebooted. Total downtime was under ten minutes.
  • [Oct 3, 2000, 4:20 pm] neter maintenance

    neter will be taken down shortly for drive maintenance. Downtime should be under ten minutes. There will later be another short period of downtime when the drive swap is finalized.
  • [Oct 3, 2000, 12:04 am] shen upgrade

    shen (www88) was upgraded to a Pentium III, 733 MHz with 256 MB of RAM. Downtime was under ten minutes.
  • [Oct 2, 2000, 11:44 pm] mildh upgrade

    mildh (www87) was upgraded to a Pentium III, 733 MHz with 256 MB of RAM. Downtime was under ten minutes.
  • [Oct 2, 2000, 11:28 pm] auma upgrade

    auma (www86) was upgraded to a Pentium III, 733 MHz with 256 MB of RAM. Downtime was under ten minutes.
  • [Oct 2, 2000, 11:09 pm] nuumen Downtime

    nuumen (www55) was down for about 10 minutes tonight. It has since been brought back to normal service.
  • [Oct 2, 2000, 10:38 pm] cele upgrade

    cele (www73) has been upgraded to a Pentium III 733 MHz, with 256 MB of RAM. Downtime was under 15 minutes.
  • [Oct 2, 2000, 11:25 am] epsilon maintenance completed

    The hardware maintenance on epsilon (www3) has been completed. The hard drive has been upgraded to have 18 GB total space. Additionally it was upgraded to an AMD Athlon at 800 MHz, with 384 MB of RAM.
  • [Sep 29, 2000, 11:49 am] epsilon maintenance

    epsilon (www3) is being worked on at this time for a hard drive upgrade. There will be an additional brief period of downtime later for the final swap. We apologize for the inconvenience.
  • [Sep 28, 2000, 11:37 pm] flure upgrade

    flure (www75) was upgraded to a Pentium III, 733 MHz with 256 MB of RAM. Downtime was under 10 minutes.
  • [Sep 28, 2000, 11:13 pm] gwind upgrade

    gwind (www74) was upgraded to a Pentium III, 733 MHz with 256 MB of RAM. Downtime was under ten minutes.
  • [Sep 28, 2000, 10:53 pm] dyyme upgrade

    dyyme (www71) has been upgraded to a Pentium III 733 MHz, with 256 MB of RAM. Downtime was under ten minutes.
  • [Sep 28, 2000, 10:33 pm] tama upgrade

    tama (www70) has been upgraded to a Pentium III 733 MHz, with 256 MB of RAM.
  • [Sep 27, 2000, 6:01 pm] falku downtime

    falku (www98) crashed under heavy load, and was brought back online. Downtime was under ten minutes.
  • [Sep 27, 2000, 4:55 pm] eite maintenance completed

    The drive maintenance on eite has been completed. New drive capacity is now 15 GB.
  • [Sep 27, 2000, 3:30 pm] pair2000 Mail Delivery

    We have corrected a qmail configuration problem which was delaying e-mail delivery on a few of our pair2000-enabled servers. Without this fix, all e-mail delivery was being delayed for up to 30 minutes. We have implemented a system fix which will prevent the recurrence of this problem. Except under conditions of unusual mail volume, all mail delivery under pair2000 should now be nearly instantaneous.
  • [Sep 27, 2000, 1:32 pm] eite maintenance

    After detecting the beginnings of a problem with the hard drive in eite (www38), we are beginning a preemptive emergency drive swap of this server before any user data is lost. There will be two brief downtimes at the beginning and end of maintenance of approximately 5 minutes, while the hardware is replaced. The server will remain online at all other times throughout.

    We will post an additional notice when this maintenance is completed.

  • [Sep 27, 2000, 1:00 pm] onn downtime

    onn (www29) crashed under heavy load, and was brought back online. Downtime was under ten minutes.
  • [Sep 26, 2000, 5:02 pm] khurla upgrade completed

    The drive upgrade of khurla has been completed, with a new drive space of 15 GB to better accommodate future growth.
  • [Sep 26, 2000, 4:36 pm] UUnet Resolution

    We are severely disappointed with UUnet's handling of our OC-3c upgrade, and grievously embarrassed by the timing of what should have caused no interruption of traffic whatsoever. The OC-3c circuit was fully tested and accepted by all parties involved, at every layer. IP traffic was passed successfully over it before the changeover.

    Unfortunately, when the changeover was initiated around 12:30pm, traffic refused to flow inbound from UUnet. This was resolved by renumbering the circuit, on the assumption that there was a conflict with some other customer circuit. At 2:45pm, UUnet again renumbered the circuit, in order to establish a more permanent address. At this point, our network was again blackholed by UUnet, a situation which persisted intermittently until approximately 4:15pm. During this time, customers and site visitors attempting to reach our network via UUnet were often unsuccessful.

    Upon reverting to the prior configuration, the problem was not resolved. At this point, UUnet was stumped, and called in additional engineers. After 90 minutes and another circuit configuration change, traffic was restored to normal, flowing in both directions on the OC-3c circuit. We do not know if changes were made elsewhere in UUnet's network.

    The DS-3 circuit will continue to act as a warm standby. We are pursuing remedy for this outage with UUnet, but recognize that nothing short of eliminating such incidents will truly protect our reputation for technical excellence and network uptime. As UUnet continues to be the only network with a history of blackholing our inbound traffic, for whatever technical reason, we expect to continue to focus our traffic growth on our other backbone providers, including our pending OC-3c circuit to Sprint as well as our new AT&T DS-3 service.

    pair Networks, Inc offers its most sincere apology to customers and site visitors affected by this partial outage. Uptime and reliability continue to be our top goals, and this type of outage should never occur again.

  • [Sep 26, 2000, 2:40 pm] thesel downtime

    thesel (www99) crashed under heavy load. Downtime was approximately 10 minutes.
  • [Sep 26, 2000, 1:57 pm] UUnet Resolution

    After an extensive debugging session with UUnet engineers, our OC-3c circuit is now up and successfully passing traffic in both directions. For approximately forty minutes, outbound traffic was working normally, while inbound traffic was being blackholed by UUnet. This is another manifestation of internal routing problems at UUnet, an issue we have repeatedly tried to address with their engineers.

    The original problem has not been resolved, but worked around. There will be one additional brief interruption of OC-3c connectivity while the interface is reconfigured. Traffic will be carried by the DS-3, as well as other backbones, in the interim.

  • [Sep 26, 2000, 12:57 pm] kirlian downtime

    kirlian (www110) crashed under heavy load, and was brought back online. Downtime was around 20 minutes.
  • [Sep 26, 2000, 12:47 pm] UUnet Upgrade

    We are continuing the process of upgrading our UUnet circuit from DS-3 to OC-3c. During the cutover, portions of UUnet's network are apparently blackholing traffic destined for our network. We are working with UUnet engineers on this urgent matter, and will post further details as they become available.
  • [Sep 26, 2000, 12:31 pm] khurla upgrade

    We have begun a drive upgrade on khurla (www112) in order to improve performance and storage. There will be two brief periods of downtime, each approximately five minutes, at the beginning and end of the maintenance. The server will remain online at all other times. The entire upgrade could take up to 24 hours to complete.

    We will post an additional notice when the upgrade has been completed.

  • [Sep 26, 2000, 3:48 am] yekk downtime

    yekk (www124) crashed under heavy load, and required an extensive filesystem cleaning. Downtime was around 30 minutes.
  • [Sep 25, 2000, 11:47 pm] pi upgrade

    pi (www10) has been upgraded to a Pentium III 733 MHz, with 256 MB of RAM. Total downtime was under 15 minutes.
  • [Sep 25, 2000, 11:35 pm] nuumen Downtime

    nuumen (www55) crashed under heavy load. Downtime was approximately 10 minutes.
  • [Sep 25, 2000, 10:44 pm] arda upgrade

    arda (www62) was upgraded to a Pentium III 733 MHz, with 256 MB of RAM. Downtime was under ten minutes.
  • [Sep 25, 2000, 10:29 pm] roomen upgrade

    roomen (www61) was upgraded to a Pentium III 733 MHz, with 256 MB of RAM. Downtime was under ten minutes.
  • [Sep 25, 2000, 4:43 pm] humph upgrade complete

    Maintenance on humph has been completed. This server is upgraded to a P-III 733MHz with 15GB of drive space.
  • [Sep 25, 2000, 4:05 pm] perthro downtime

    perthro (www143) crashed under heavy load and required a manual reboot. Total downtime was approximately 10 minutes.
  • [Sep 25, 2000, 12:33 pm] humph maintenance

    humph (www118) will be brought down briefly today for maintenance and upgrades. Downtime should be less than 15 minutes.
  • [Sep 25, 2000, 12:08 pm] Networking Changes

    We are currently in the process of activating our long-awaited OC-3c circuit to UUnet. Our existing DS-3 will remain in place as an emergency failover circuit. During this changeover, no traffic should be lost, but some customers may experience intermittent latency as traffic shifts between circuits.

    We will post further details once this change has been completed.

  • [Sep 21, 2000, 7:05 pm] neter Downtime

    neter (www132) crashed, requiring a reboot. Downtime was approximately 10 minutes.
  • [Sep 21, 2000, 1:12 pm] Genuity Upgrade

    We have completed an urgent upgrade to our Genuity circuit, which should alleviate congestion for customers using that link. We still have a pending order to rehome our circuit from Washington DC to Cleveland OH for improved performance; this should be complete within the next 30 days.
  • [Sep 20, 2000, 10:49 pm] nuin maintenance completed

    The upgrade for nuin has been completed. This server is currently alive as a P-III 733 with 15GB of drive space.
  • [Sep 20, 2000, 2:26 pm] bemnet downtime

    bemnet (www158) crashed under heavy load. It has since been rebooted after a thorough file system check. Downtime was under 15 minutes.
  • [Sep 20, 2000, 12:42 pm] nuin upgrade

    nuin will be brought down briefly today for scheduled maintenance in an upgrade of drive space and processor. Each downtime should be no more than 15 minutes. Details will follow when this maintenance upgrade is completed.
  • [Sep 19, 2000, 7:07 pm] Network problems

    One of our Cisco 7200s experienced a memory problem, causing errors with its CEF routing tables. We performed emergency maintenance, and it is now properly routing traffic again. We expect performance to be stable, but are closely monitoring the router to be sure.
  • [Sep 19, 2000, 3:24 pm] Genuity Upgrade

    We are seeing some saturation on our DS-3 circuit to Genuity; an emergency upgrade is being implemented which should alleviate the problem. This upgrade will take effect Wednesday morning.
  • [Sep 19, 2000, 12:46 pm] bemnet downtime

    bemnet rebooted under heavy load, and returned after a file systems check with approximately 10 minutes of downtime.
  • [Sep 19, 2000, 11:33 am] sether downtime

    sether (www95) was rebooted to clear up an unrecoverable high load condition. Downtime was under two minutes.
  • [Sep 19, 2000, 6:12 am] coll Downtime

    coll (www22) crashed due to a mail loop and was rebooted. Downtime was approximately 10 minutes.
  • [Sep 18, 2000, 10:56 pm] vala upgrade

    vala (www59) has been upgraded to a Pentium III 733 MHz, with 256 MB of RAM. Downtime was under 10 minutes.
  • [Sep 18, 2000, 10:28 pm] noldo upgrade

    noldo (www56) has been upgraded to a Pentium III, 733 MHz server with 256 MB of RAM. Total downtime was under 10 minutes.
  • [Sep 18, 2000, 10:19 am] pi Downtime.

    pi (www10) crashed under heavy load. It has since been rebooted after a thorough file system check. Downtime was 20-25 minutes.
  • [Sep 17, 2000, 8:53 pm] ungwe Mail Delivery

    We have identified and corrected a problem with mail delivery on ungwe (www48) that caused some users' mail to be queued throughout the day instead of being delivered. No mail was lost, and at this time all queued mail has been distributed to the appropriate account.
  • [Sep 16, 2000, 2:43 pm] chi Downtime

    chi (www14) crashed under load and was brought back online with downtime of approximately 10 minutes.
  • [Sep 16, 2000, 9:54 am] epsilon Downtime

    epsilon (www3) crashed under load and was brought back online with downtime of approximately 10 minutes.
  • [Sep 15, 2000, 6:45 pm] derba Downtime

    derba (www104) suffered a crash due to heavy load. It was brought back online within 5 minutes.
  • [Sep 14, 2000, 1:21 pm] jarre downtime

    jarre halted under heavy load, and was rebooted. Downtime was less than 10 minutes.
  • [Sep 14, 2000, 9:06 am] eeoth Downtime

    eeoth crashed under heavy load, and has since been rebooted. Downtime was approx. 6-8 minutes.
  • [Sep 13, 2000, 3:47 am] Genuity Maintenance

    During the course of routine maintenance, Genuity's switch in Washington DC was reset, resulting in 3 minutes of downtime and about 15 minutes of latency on our Genuity DS-3 circuit. Their switch was brought back online with no complications, and traffic has returned to normal.
  • [Sep 11, 2000, 9:45 pm] cele upgrade

    cele was brought down for approximately 10 minutes to complete an upgrade of its hard drive. This server now has a capacity of 16 GB.
  • [Sep 11, 2000, 3:55 pm] pyyl Downtime

    pyyl (www97) crashed under heavy load, and was brought back online with less than ten minutes downtime.
  • [Sep 11, 2000, 8:05 am] pair2000 Deployment

    Based on recent bug reports accumulated during the system maintenance period, we have elected to extend system maintenance for an additional week, in order to resolve known problems before expanding the pair2000 system to encompass additional users. Our conversion schedule will be resumed starting the next week.

    Servers scheduled for this week will be redispersed throughout the rest of the month's schedule.

  • [Sep 10, 2000, 8:03 pm] quan Downtime

    quan (www120) crashed multiple times in a row, and required an extensive filesystem check with each crash. As there was no sign of any load problem, we suspected a hardware problem such as flaky RAM or a failing motherboard. We have swapped the system to a new Athlon 700 MHz, with 384 MB SDRAM. The server appears to be acting normally again, and we do not expect to see any more stability problems with this server. Total downtime during the crashes and swap was a little over an hour.
  • [Sep 10, 2000, 12:50 am] UUnet Resolution

    Beginning around 10:15pm Eastern time this evening, our gateway router to UUnet went out of service. This was a repeat of the incident that was believed to be resolved at 6pm. The problem was traced to a Gigabit Ethernet interface present in the router. Once disabled, the router promptly returned to normal service. For approximately thirty minutes, some traffic that normally passes through UUnet was delayed or dropped by the malfunctioning router.

    Needless to say, we continue to be disappointed with the behavior of Gigabit Ethernet on the Cisco 7507 platform. This problem was triggered by the appearance of Gigabit traffic from our Black Diamond switches; the switches do not appear to have been culpable. As recently indicated in the Insider Newsletter, we have committed to the Juniper routing platform for future expansion; the Juniper happens to be well-known for wire-speed Gigabit Ethernet performance and carrier-class reliability.

    We offer our sincere apology to any customer affected by this incident. We will continue to seek to improve reliability, performance, and redundancy as our network expands.

  • [Sep 9, 2000, 10:53 pm] UUNet Gateway Maintenance

    We are currently conducting emergency maintenance on our UUNet Gateway. We apologize for any inconvenience our customers experience during this maintenance, and we hope to have it resolved shortly.
  • [Sep 9, 2000, 6:09 pm] quan Downtime

    quan crashed under heavy load, and was brought back online with less than ten minutes downtime.
  • [Sep 9, 2000, 5:46 pm] UUnet Gateway Failure

    During testing of our Black Diamond switch configuration earlier today, an unexplained interaction with the Cisco router serving our gateway to UUnet led to degraded resources which ultimately resulted in poor processing by that router. During the day, a limited set of routes were unavailable; overall traffic levels were not reduced noticeably, but there was at least one specific customer complaint through which the problem was identified.

    While we were working with the degraded router, it crashed and was brought back online manually. It is now serving our UUnet traffic successfully, without any incorrect routes, and will be monitored closely for further resource problems. The Black Diamond switches have been disconnected from our LAN, pending an investigation of the origin of this interaction.

    We apologize for the inconvenience to any customer who had difficulty or delays in reaching our network during this period, including the UUnet gateway outage of approximately fifteen minutes, during which traffic was carried by alternate providers. We are working carefully on the switch integration to ensure that there are no interactions such as what was experienced today. We shall remain careful and cautious.

  • [Sep 7, 2000, 4:05 am] zatz Downtime

    zatz (www122) crashed under heavy load. It was brought back online within 5 minutes.
  • [Sep 5, 2000, 5:57 pm] omega downtime

    omega was rebooted, and given an extensive filesystem check. It has returned to service with less than 10 minutes downtime.
  • [Sep 3, 2000, 3:38 am] kodh downtime

    kodh (www90) crashed under heavy load, and had to be brought back up. Total downtime was 20 minutes.
  • [Aug 31, 2000, 6:44 pm] pair2000 Mail Delivery

    Mail delivery in the pair2000 system has been enhanced to provide the X-Envelope-To field that some of our users have come to rely on. This is a nonstandard field that our sendmail-based system currently provides on most mail deliveries. We have also corrected a glitch that was causing the incorrect envelope sender to be included on some mail delivered to virtual mailboxes.

    For more information on pair2000, please visit http://www.pair2000.com/.

  • [Aug 29, 2000, 10:08 am] unque Upgraded

    As a result of the Ethernet upgrade on unque, the server is now a Pentium III at 733 MHz, with 256MB RAM. This emergency maintenance has caused unque's pair2000 upgrade to be rescheduled for tomorrow.
  • [Aug 29, 2000, 9:25 am] unque Emergency Maintenance

    The Ethernet card on unque is failing, causing severe traffic problems on that server. It is being replaced on an emergency basis, and will be out of service for no more than ten minutes.
  • [Aug 28, 2000, 11:32 am] tiwaz downtime

    tiwaz (www146) crashed under heavy load, and was brought back online. Downtime was around 10 minutes.
  • [Aug 28, 2000, 11:11 am] pi downtime

    pi (www10) crashed under heavy load. After an extensive filesystem cleaning it was brought back up, with downtime around 20 minutes.
  • [Aug 26, 2000, 12:30 pm] omicron Downtime

    omicron (www9) crashed under load, and was brought back online with downtime less than 15 minutes.
  • [Aug 25, 2000, 5:20 am] cele Downtime

    cele crashed under heavy load and was rebooted. Total downtime was approximately ten minutes.
  • [Aug 24, 2000, 5:41 pm] neled downtime

    neled (www127) crashed under heavy load, and was brought back online with downtime under ten minutes.
  • [Aug 24, 2000, 2:32 pm] shen downtime

    shen (www88) crashed under high load and had to be rebooted. Total downtime was less than 15 minutes.
  • [Aug 22, 2000, 4:03 am] ingwaz downtime

    ingwaz (www149) crashed under high load and had to be rebooted. Total downtime was less than 15 minutes.
  • [Aug 19, 2000, 5:04 pm] anca Downtime

    anca (www53) ran out of swap space, requiring a reboot. Downtime was 15 minutes.
  • [Aug 18, 2000, 7:11 pm] dair Downtime

    dair (www20) was down for approximately 10 minutes to clear up a network problem. It has since returned to normal operation.
  • [Aug 18, 2000, 3:03 am] Genuity Routing

    Based on customer feedback, we are making some adjustments to routing decisions in our network, with respect to the new Genuity DS-3. As of 3am, we have lowered the inbound preference for this line, shifting the majority of our inbound traffic towards other providers. Adjustments to outbound traffic will be made within the next 10 days, based on an ongoing detailed analysis of traffic flow performance. This should result in optimal performance for all Internet destinations, based on the routes available to us from our five backbone providers.
  • [Aug 18, 2000, 2:43 am] ilceille Resolution

    The network interface failure on ilceille was traced to obscure corrupted configuration data. The error has been corrected and the server returned to normal service. Total downtime for most domains on this server was two hours.

    This was one of the most unusual problems we've seen. After exhausting what we felt were all reasonable software solutions, the cabling and Ethernet card were swapped out. When that failed, an in-depth study was conducted to finally identify the cause. We do not expect this to recur.

  • [Aug 18, 2000, 1:41 am] iceille Downtime

    iceille (www189) is experiencing network interface problems. We are working to correct the problem and expect it to be up shortly.
  • [Aug 17, 2000, 4:07 pm] ipre downtime

    ipre crashed due to heavy load, and has been brought back online. Total downtime was less than 15 minutes.
  • [Aug 17, 2000, 3:45 pm] emancholl downtime

    emancholl (www37) crashed under heavy load, and was brought back online with downtime around 10 minutes.
  • [Aug 16, 2000, 11:01 am] epsilon downtime

    epsilon (www3) crashed today under heavy load. It was able to reboot automatically. Total downtime was under five minutes.
  • [Aug 16, 2000, 3:35 am] thesel downtime

    thesel (www99) crashed under high load and had to be rebooted. Total downtime was approx. 20 minutes.
  • [Aug 15, 2000, 4:38 pm] bemnet downtime

    bemnet (www158) crashed under heavy load, and was brought back online with downtime under ten minutes.
  • [Aug 15, 2000, 3:23 pm] cele downtime

    cele (www73) crashed under heavy load, and was brought back online with downtime around 10 minutes.
  • [Aug 14, 2000, 1:40 pm] sowilu downtime

    sowilu (www145) crashed under heavy load, and was brought back online with downtime under 10 minutes.
  • [Aug 14, 2000, 9:20 am] thuule downtime

    thuule (www49) crashed under high load this morning at 9:10am and had to be rebooted. Total downtime, 9 minutes.
  • [Aug 14, 2000, 3:24 am] ampa upgrade

    ampa (www52) was upgraded to a Pentium III, 733 MHz with 256MB of RAM. Downtime was 5 minutes.
  • [Aug 14, 2000, 3:00 am] thuule upgrade

    thuule (www49) was upgraded to a Pentium III, 667 MHz with 256MB of RAM. Total downtime for this maintenance was 10 minutes.
  • [Aug 14, 2000, 2:35 am] quesse upgrade

    quesse (www44) has been upgraded to a Pentium III 667 MHz with 255MB of RAM. Total downtime was 5 minutes.
  • [Aug 11, 2000, 9:59 pm] Digex/Sprint Connectivity Issue

    Digex is currently experiencing a connectivity problem with Sprint. This outage appears to affect only customers within home.net, as Digex to Sprint is the return path from pair. They are aware of the problem and expect resolution within 2 hours. More information will be posted as it becomes available.
  • [Aug 11, 2000, 4:57 pm] calma Downtime

    calma (www43) crashed under heavy load and had to be rebooted. Total downtime was under five minutes.
  • [Aug 11, 2000, 12:43 pm] pi Downtime

    pi (www10) crashed under heavy load and had to be rebooted. A manual disk clean was needed. Total downtime was about 20 minutes.
  • [Aug 11, 2000, 2:17 am] bhoth downtime

    bhoth (www163) crashed under high load and had to be rebooted. Downtime was less than five minutes.
  • [Aug 11, 2000, 1:23 am] aaze upgrade

    aaze (www65) was upgraded to a Pentium III 733MHz, with 256MB of RAM. Total downtime was about 10 minutes.
  • [Aug 11, 2000, 12:54 am] ungwe upgrade

    ungwe (www48) was upgraded to a Pentium III 733MHz with 256MB of RAM. Total downtime was about 5 minutes.
  • [Aug 11, 2000, 12:18 am] nwalme upgrade

    nwalme (www57) was upgraded to a Pentium III-667 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Aug 10, 2000, 11:50 pm] calma upgrade

    calma (www43) has been upgraded to a Pentium III-733 MHz, with 256 MB of RAM. Downtime was approximately five minutes.
  • [Aug 10, 2000, 11:12 pm] cuzea upgrade

    cuzea (www94) was upgraded to a Pentium III, 733 MHz with 256 MB of RAM. Downtime was under five minutes.
  • [Aug 10, 2000, 10:53 pm] silme upgrade

    silme (www64) was upgraded to a Pentium III 733 MHz with 256MB of RAM. Total downtime was about 10 minutes.
  • [Aug 10, 2000, 10:08 pm] parma upgrade

    parma (www42) was upgraded to a Pentium III, 733 MHz with 256MB of RAM. Total downtime was about five minutes.
  • [Aug 9, 2000, 10:29 am] ingwaz downtime

    ingwaz (www149) crashed due to a mailing loop, and came back up smoothly. Downtime was around five minutes.
  • [Aug 9, 2000, 4:20 am] anca downtime

    anca (www53) crashed under high load at around 3:50 Eastern, and had to be rebooted. Total downtime was 26 minutes.
  • [Aug 9, 2000, 2:17 am] vilya downtime

    vilya (www160) was temporarily disabled due to a mailing loop at around 2:00am Eastern. Downtime lasted approximately 5 minutes.
  • [Aug 8, 2000, 5:03 pm] Genuity Activation

    Our DS-3 circuit to Genuity is fully in service and running smoothly. Several customers are reporting performance improvements. Overall, less than ten percent of our traffic is being carried by Genuity, but for that ten percent, there should generally be an improvement.
  • [Aug 8, 2000, 12:26 pm] Genuity Activation

    We are in the process of turning up our DS-3 connectivity to Genuity at this time. Traffic will be shifted between providers several times, but there should be no significant ill effects. Performance will improve for certain destinations.

    We will post further when the new traffic has been stabilized.

  • [Aug 8, 2000, 12:49 am] SAVVIS Update.

    At approximately 11:11 pm (EST), our SAVVIS connection started to return to normal service.

    We have been informed by a SAVVIS representative that the problem was caused by an outage in their ATM and Frame Relay switches here in the Pittsburgh Area.

  • [Aug 7, 2000, 11:08 pm] SAVVIS Outage

    Beginning around 6:30pm, SAVVIS lost connectivity to Pittsburgh once again. This outage, like the previous incident on Sunday, affects all Pittsburgh customers of SAVVIS. SAVVIS has assured us that the outage is being handled at the highest possible level, and they are working with their circuit carrier to attempt to resolve the problem.

    We have the turn-up of our DS-3 circuit to Genuity scheduled for Tuesday morning; in the event that SAVVIS is unable to restore service by that time, Genuity will be handling that traffic instead. In the meantime, traffic is flowing over alternate paths with no adverse effect.

    We will post more information as it becomes available.

  • [Aug 7, 2000, 6:43 pm] SAVVIS Outage

    Our SAVVIS line went down again approximately ten minutes ago, and there is currently no estimated time of restoration. It is not yet known whether this problem is related to the previous problem experienced today, but we will post once we know more.
  • [Aug 7, 2000, 8:17 am] SAVVIS Resolution

    Our circuit to SAVVIS came back online at about 6:00am Eastern Time. A representative from SAVVIS informed us that this trunking problem should not occur again, as they will have the site where the difficulty is occurring staffed more thoroughly in the future.
  • [Aug 7, 2000, 4:03 am] SAVVIS Outage

    We are currently seeing a drastic reduction of the amount of traffic flowing on our SAVVIS circuit due to a trunk problem in the New York-Chicago corridor, and our service with them has been down for approximately 20 minutes as a result. In the meantime, traffic is taking alternate paths with no degradation of service.
  • [Aug 6, 2000, 11:04 pm] arda downtime

    arda (www62) crashed under heavy load, and was brought back up cleanly. Downtime was 10 minutes.
  • [Aug 6, 2000, 2:02 pm] SAVVIS Outage

    SAVVIS has traced their trunking problem to a router failure which they are currently working to repair. They expect to restore service by 5pm Eastern time. Traffic is flowing over alternate paths during this outage.
  • [Aug 6, 2000, 8:28 am] SAVVIS Outage

    SAVVIS is currently experiencing a trunk problem in the New York-Chicago corridor, and our service with them has been down for approximately 15 minutes as a result. They expect to have the matter resolved within an hour; in the meantime, traffic is taking alternate paths with no degradation of service.
  • [Aug 5, 2000, 6:46 pm] jarre downtime

    jarre (www172) crashed today under high load. It was brought back up cleanly. Total downtime was 10 minutes.
  • [Aug 2, 2000, 3:24 am] tinne Downtime

    tinne (www21) crashed under high load at 2:46am and had to be rebooted. It was brought back up to full activity at 3:13am.
  • [Aug 1, 2000, 3:49 pm] umbar Downtime

    umbar (www46) crashed under load, requiring a reboot. Downtime was approximately 10 minutes.
  • [Jul 31, 2000, 11:39 pm] nuumen Downtime

    nuumen (www55) reset itself due to heavy load. Downtime was less than 10 minutes.
  • [Jul 31, 2000, 4:45 pm] theta Downtime

    theta (www4) crashed under heavy load, and required extensive filesystem repair before it could be rebooted. Total downtime was approximately 25 minutes.
  • [Jul 31, 2000, 12:30 am] kodh Downtime

    kodh (www90) rebooted under heavy load, and required extended filesystem cleaning before being returned to normal operation. Downtime was about 30 minutes.
  • [Jul 29, 2000, 12:48 am] idad Upgrade

    After a third crash in less than an hour, idad has been upgraded to a Pentium III at 733 MHz with 256MB SDRAM. We do not expect any further stability problems from this server.
  • [Jul 28, 2000, 11:18 pm] idad Downtime

    idad has crashed and quickly returned to service three separate times today. As there is no sign of any load problem, we suspect a hardware issue such as flaky RAM or a failing motherboard. We will be swapping the system out within the next 72 hours.
  • [Jul 28, 2000, 9:30 pm] thurisaz Downtime

    thurisaz (www136) crashed under heavy load. It has been returned to normal service and had a downtime of about 5 minutes.
  • [Jul 28, 2000, 6:40 pm] derba Downtime

    derba (www104) crashed under heavy load. It has since been returned to normal operation with a downtime of under 10 minutes.
  • [Jul 28, 2000, 1:24 pm] jarre Downtime

    jarre (www172) crashed under heavy load, and had to be rebooted. Total downtime was approximately 10 minutes.
  • [Jul 27, 2000, 3:57 am] idad downtime

    idad (www32) crashed under high load and had to be rebooted. Total downtime, approximately 15 minutes.
  • [Jul 26, 2000, 1:30 pm] cele Downtime

    cele (www73) crashed under heavy load, and had to be brought back up manually. Total downtime was under 15 minutes.
  • [Jul 25, 2000, 10:09 pm] straif downtime

    straif (www26) crashed under heavy load, and was brought back up. Total downtime was five minutes.
  • [Jul 23, 2000, 8:19 pm] gao Downtime

    gao (www111) was rebooted this evening to clear up a network problem. Total downtime was less than 5 minutes.
  • [Jul 23, 2000, 4:37 am] uilen Downtime

    uilen crashed under heavy load, and required extensive manual filesystem cleaning before returning to service. Downtime was approximately 40 minutes.
  • [Jul 22, 2000, 7:09 pm] parma downtime

    parma (www42) was brought down for a routine upgrade at 6:24PM EST. There was a problem with the upgrade, and the maintenance has been deferred. Total downtime was 30 minutes.
  • [Jul 22, 2000, 5:15 am] Router Maintenance

    Between 4am and 5am Eastern time, routine upgrades were performed on one of our gateway routers. Customer traffic was temporarily rerouted but was not disrupted. These upgrades move us further towards the deployment of the Extreme Black Diamond switches.
  • [Jul 20, 2000, 1:26 am] Sprint Resolution

    Apparently the traffic reduction was due to some routine maintenance on Sprint's part which has since been taken care of. All appears to be well now.
  • [Jul 20, 2000, 1:05 am] Sprint Trouble

    We are currently seeing a drastic reduction in the amount of traffic flowing on our Sprint circuit. All customer traffic is flowing over alternate paths at this time, and we are working with Sprint to find and fix the problem. Further details will be posted as they become available.
  • [Jul 19, 2000, 10:57 am] pair2000 Update

    ipre (www181), sothor (www182), and khin (www184) have been converted to the new pair2000 system. Users on these servers may now control their account via the Account Control Manager, located at https://acc.pair2000.com/. Two other servers scheduled for today, aerre (www185) and linato (www186), have been delayed for one week while we make improvements to the pair2000 architecture. The current schedule may be found at http://www.pair2000.com/schedule.html
  • [Jul 19, 2000, 10:30 am] sowilu Downtime

    sowilu (www145) crashed under heavy load at approximately 10:10 EDT. Downtime was approximately 10 minutes.
  • [Jul 18, 2000, 11:32 pm] laguz Downtime

    laguz (www148) crashed under heavy load. Downtime was less than 10 minutes.
  • [Jul 16, 2000, 5:02 pm] fearn Downtime

    fearn (www40) crashed under heavy load. After it came back up, the problems lingered until they were tracked down to a misbehaving .procmailrc file. Total downtime was 30 minutes.
  • [Jul 16, 2000, 8:48 am] idad downtime

    idad (www32) crashed under high load at 8:33AM EST. Total downtime was 10 minutes.
  • [Jul 14, 2000, 7:58 am] Miva Upgrade

    Miva Empresa, the CGI binary used by Miva Merchant, has been upgraded to version 3.71 on all commerce servers. Miva Corporation reports that this new version patches several security problems their testing uncovered.
  • [Jul 14, 2000, 7:16 am] UUnet Update

    UUnet continues to have serious problems with one of their core routers in the Washington, DC area. Although we are assured that the matter has been escalated, and it appears to affect all of their traffic on the East Coast, there is still no projected window of repair, nor is there any updated information on the official UUnet Status page at http://www.noc.uu.net/.

    pair Networks, Inc prides itself on informing its customers honestly and directly, even in the case of significant failures such as the two incidents with switch failures in the past 24 hours. We are severely disappointed that UUnet cannot acknowledge the severity of any problem, provide proper escalation and feedback, and notify the public, or at least their customers, through the channels they have set up for that purpose.

  • [Jul 14, 2000, 7:11 am] Switch Failure

    One of our Extreme Networks switches went offline at approximately 5:50am today. The switch was completely non-responsive and had apparently lost power. The switch was swapped with one of our emergency spares, and is currently back in normal operation. Total downtime for the affected servers was approximately 50 minutes.

    The power supply of the switch has apparently failed; the A/C fuse was completely vaporized. Yesterday's switch problem appears to be related to faulty Flash memory. The power supply problem on a different switch this morning appears to be completely unrelated. We do not believe that either failure is related to the recent software upgrades performed on the switches; both failures seem to be hardware-related.

    We apologize for this repeated incident. We continue to have the greatest faith in the Extreme hardware, and will be replacing the failed hardware promptly.

  • [Jul 14, 2000, 6:12 am] Network Errors

    In a separate incident not related to the network errors experienced on July 12, one of our switches has physically failed, affecting approximately 20 servers. The switch is currently being swapped, and the servers should be back online shortly.
  • [Jul 14, 2000, 2:17 am] UUnet Trouble

    Our UUnet traffic has taken a sudden downward spike five times in the past twelve hours. When we contacted UUnet, they reported difficulties with a linecard in one of their core routers. The problem does not yet appear to be resolved, but we will continue to monitor the situation and report any further problems.
  • [Jul 13, 2000, 11:27 am] Switch Replaced

    The Extreme Networks switch that was failing has been replaced, and all affected servers are back in normal operation. Including time to reconfigure some servers, the swap took approximately 25 minutes. We will continue to monitor the new switch closely to ensure there are no further problems.
  • [Jul 13, 2000, 10:12 am] Switch Failure

    The lingering server problems have been traced primarily to a failing Extreme Networks switch. That switch is now being replaced on an emergency basis. Total downtime for affected servers should be less than ten minutes. Further information will be posted when the matter is resolved.
  • [Jul 13, 2000, 9:50 am] Server Maintenance

    We have discovered lingering network performance issues on a number of user servers, resulting from the new configuration of our switches after last night's software upgrade. We will be reconfiguring servers, and in some cases, taking them offline briefly for a network card upgrade, in order to eliminate this problem. User impact will be minimized to the greatest extent possible.
  • [Jul 13, 2000, 7:28 am] idad downtime

    idad (www32) crashed under high load at around 6:18am EST. Downtime approximately 20 minutes.
  • [Jul 12, 2000, 11:43 pm] ydhu downtime

    In a separate incident not related to the network troubles, ydhu (www83) rebooted under heavy load. After a manual file systems check, it has been brought online. Approximate downtime was 20 minutes.
  • [Jul 12, 2000, 11:40 pm] Network Problems Resolved

    The network problem we encountered earlier while upgrading our network switches has been resolved. Total downtime on affected servers was approximately 1 hour, and all user servers are back in full operation at this time.
  • [Jul 12, 2000, 11:01 pm] Network Problems

    During the course of routine maintenance, we upgraded software on our switches. The new software version didn't take on two of the switches, and we are backtracking the revision now.
  • [Jul 12, 2000, 10:50 am] pi downtime

    pi (www10) crashed under heavy load and required an extensive filesystem cleaning. Downtime was around 25 minutes.
  • [Jul 11, 2000, 11:24 am] uumor downtime

    uumor (www159) was rebooted to clear up a high load condition. Downtime was approximately one minute.
  • [Jul 10, 2000, 7:16 pm] Network Maintenance

    On Wednesday night, 7/12/00, from approximately 8-10pm we will be performing further maintenance on our network equipment. There will be less than a minute of total interruption of connectivity for each affected server.
  • [Jul 10, 2000, 4:16 am] cele downtime

    cele (www73) crashed under high load and had to be rebooted 3 times.
  • [Jul 10, 2000, 1:44 am] bemnet downtime

    bemnet (www158) crashed under high load and rebooted normally. Total downtime was less than 20 minutes.
  • [Jul 6, 2000, 11:16 pm] iota downtime

    iota (www5) crashed under heavy load, and was brought back online with 10 minutes downtime.
  • [Jul 6, 2000, 7:52 am] kodh Resolved

    kodh's primary drive suffered an apparent failure at approximately 6:30am. After reseating all cables, the drive came back online and booted after an extensive filesystem cleaning. We do not believe the problem will persist. Total downtime was approximately fifty minutes.
  • [Jul 6, 2000, 6:50 am] Downtime for kodh

    A drive has failed on kodh (www90). We are currently working to restore it from backup. More information will be posted as it is available.
  • [Jul 6, 2000, 6:38 am] idad Downtime

    idad (www32) crashed under heavy load and was brought back online with downtime around 30 minutes.
  • [Jul 3, 2000, 3:33 pm] Network Maintenance

    We will be performing routine maintenance on our network switches Thursday night (7/6/00) from 8-10pm. Customer traffic should not be adversely affected.
  • [Jul 1, 2000, 11:33 am] xi Cleanup

    Cleanup of all files on xi has been completed. Some files were restored from Friday's backup, while others were recovered directly from the failing drives. Logs for June 30th have been generated, and all mail has been restored (older mail was temporarily missing). Any user who encounters problems should report them to support@pair.com or urgent@pair.com as appropriate.

    We apologize for the difficulty caused by this drive failure. We believe damage was minimal, and the server is now back in regular service with a faster drive, more memory, and a more powerful CPU.

  • [Jul 1, 2000, 9:49 am] emancholl Downtime

    emancholl (www37) crashed under heavy load, and required manual disk cleaning before it could be brought back online. Downtime was 50 minutes.
  • [Jul 1, 2000, 1:28 am] xi emergency maintenance

    Restoration of files from xi's failed hard drive is continuing, and about 3/4 done. User access has been restored to the service, though not all files have been restored yet. In the process of the maintenance, we took the opportunity to upgrade xi to a Pentium III 667 MHz with 256 MB of RAM.
  • [Jun 30, 2000, 8:36 pm] xi maintenance

    During the drive swap on xi to replace its failing /u3 drive, the drive died. We are currently restoring the data from backups, and should have it fully operational again shortly. Another notice will be posted at that time.
  • [Jun 30, 2000, 2:51 pm] xi Maintenance

    Effective immediately, we are beginning a drive swap on xi, which is showing drive errors on the /u3 partition. Initial downtime will be approximately ten minutes, and the swap will be completed on Saturday, with another brief downtime. This will result in more disk space, memory, and CPU capacity for xi. Customer traffic should not be significantly affected.
  • [Jun 30, 2000, 2:47 pm] one.pairlist.net Maintenance

    In light of recent crashes and data corruption on one.pairlist.net, it is being replaced with a temporary server effective immediately. The new server is somewhat less powerful (Pentium II at 450MHz), but will be upgraded as load on the system requires. Downtime will be less than ten minutes.
  • [Jun 30, 2000, 2:38 pm] news.pair.com Failure

    The power supply on news.pair.com failed. The system drive was promptly moved to a new chassis, and is now running as a Pentium III at 667MHz, with 256MB RAM. We will be building an entirely new news server in the near future.
  • [Jun 30, 2000, 1:57 pm] pairlist.net Outage

    While working to resolve a mailing loop on a particularly large mailing list, a piece of software supporting Mailman on pairlist.net became corrupted. It was promptly reinstalled, and the offending list has been suspended from service.

    There was a brief period during which mail to all lists bounced. We are planning an upgrade to Mailman 2.0 as soon as it is released.

  • [Jun 29, 2000, 11:40 pm] ifin upgrade

    ifin (www36) has been upgraded to a Pentium III 667 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Jun 29, 2000, 11:16 pm] or upgrade

    or (www34) was upgraded to a Pentium III 667 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Jun 29, 2000, 10:51 pm] ebad upgrade

    ebad (www33) has been upgraded to a Pentium III 667 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Jun 29, 2000, 10:27 pm] edad upgrade

    edad (www31) has been upgraded to a Pentium III 667 MHz, with 256 MB of RAM.
  • [Jun 29, 2000, 6:50 pm] thuule downtime

    thuule (www49) crashed at 6:09 pm EST; it was brought back online with 15 minutes of downtime.
  • [Jun 26, 2000, 11:24 pm] ur upgrade

    ur (www30) has been upgraded to a Pentium III 667 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Jun 26, 2000, 10:53 pm] muin upgrade

    muin (www23) has been upgraded to a Pentium III 667 MHz, with 256 MB of RAM. Downtime was under five minutes.
  • [Jun 26, 2000, 7:47 pm] glikk upgrade

    glikk (www119) was taken down briefly for a drive upgrade. The new total disk space on this server is increased to 20.5GB.
  • [Jun 25, 2000, 8:57 pm] ungwe downtime

    ungwe (www48) crashed around 1AM. It was brought back online with 30 minutes of downtime.
  • [Jun 24, 2000, 3:28 pm] xi Downtime

    xi (www8) crashed, requiring a manual disk cleaning. It was brought back online with 15 minutes of downtime.
  • [Jun 22, 2000, 5:33 pm] UUnet Resolution

    UUnet's line has returned to normal service as of about 5:00PM EST. A representative from UUnet has been in contact with us to describe to us the cause of the failure.

    As stated previously we will be working even faster now to add extra connectivity, so as to reduce the effect of future outages.

  • [Jun 22, 2000, 12:07 pm] UUnet cont.

    The problem has been identified as with UUnet's Pittsburgh POP, and not with any of our equipment. Work is currently in progress, and we are awaiting status reports from them on the progress of this outage.
  • [Jun 22, 2000, 11:49 am] UUnet outage

    We have lost contact with UUnet's Pittsburgh POP. We are currently working with UUnet to identify the source of the problem and repair. We will post additional details as they become available.
  • [Jun 20, 2000, 11:07 am] UUnet Restored to Service

    After an outage of almost exactly seventeen hours, UUnet's connectivity to Pittsburgh has been restored. We apologize for the inconvenience to our customers and their site visitors. We are working hard to expand our connectivity to further reduce the possible effects of any future outages.
  • [Jun 20, 2000, 10:36 am] UUnet Update

    We are now in the 17th consecutive hour of UUnet's complete Pittsburgh outage. The latest report from UUnet is that as of 9:30am Eastern, fiber splicing has been completed, and end-to-end testing has commenced. They expect a return to normal service sometime this morning.

    We have accelerated our Genuity DS-3 to next Monday, June 26, and expect also to have our Sprint OC-3c online as early as the end of July. We are guiding our bandwidth expansion decisions by UUnet's apparent problems in maintaining connectivity to our city. It is unacceptable for their connectivity here to be vulnerable, as it apparently is, to a single point of failure.

  • [Jun 20, 2000, 1:58 am] UUnet Update

    WorldCom is reporting an outage of eleven OC-48 long-haul circuits, effectively partitioning UUnet's Pittsburgh POP from the rest of their network. This is the same problem that occurred in March of this year. We are taking steps with UUnet to reduce the impact of such future outages, and will be favoring other providers not only during the outage, but in future purchasing decisions as well.

    We will post further information when it becomes available.

  • [Jun 19, 2000, 6:20 pm] UUnet Outage

    Our UUnet line is currently down. We are working with a technician to bring it back up. At present we only have information about an outage in the Pittsburgh area, and will post more updates once we have more data.
  • [Jun 19, 2000, 2:30 pm] idad maintenance completed

    The hard drive upgrade of idad (www32) has been completed. The new total disk space on this server is 20.5GB.
  • [Jun 19, 2000, 12:41 pm] UUnet Traffic

    Due to an unfortunate two-week delay in the final cross-connect of our OC-3c circuit to UUnet, we are currently experiencing saturation and latency on our UUnet DS-3 circuit. We have shifted some traffic to Digex and Sprint in order to compensate; this will reduce but not eliminate the problem.

    We expect to be able to turn up the new circuit on the morning of Tuesday, June 20. The new circuit will eliminate the problem condition. We also expect to have our new DS-3 circuit to Genuity active within the next week, and the OC-3c circuit order for Sprint was recently accelerated to a target date of late July.

    We apologize for the temporary inconvenience. We are doing everything possible to manage the bandwidth demand.

  • [Jun 16, 2000, 3:15 pm] idad maintenance

    idad (www32) was taken down briefly for a hard drive upgrade. There will be one more brief downtime to complete this process once the swap is finished.
  • [Jun 14, 2000, 11:46 am] jarre downtime

    jarre (www172) crashed under heavy load, and came back without incident after a filesystem check.
  • [Jun 13, 2000, 12:25 pm] pi downtime

    pi (www10) crashed under heavy load and was brought back online with downtime around 15 minutes.
  • [Jun 13, 2000, 10:03 am] db3 Upgrade

    db3 has been upgraded to FreeBSD 3.4 and MySQL 3.23.11. All database servers not running under 3.4 will be upgraded in the near future, in accordance with our database upgrade plan.
  • [Jun 13, 2000, 8:50 am] db3 Maintenance

    db3 will be taken down briefly this morning for maintenance.
  • [Jun 11, 2000, 7:07 am] Digex Resolution

    The problem with inbound traffic on Digex was traced to the resolution of a pre-existing problem, not the beginning of a problem. We are not seeing any routing or traffic problems with Digex at this time.
  • [Jun 11, 2000, 12:03 am] Digex Trouble

    Beginning around 8:45pm Eastern time, a precipitous drop in inbound traffic from Digex was observed. The missing traffic, which has not appeared on alternate inbound paths, appears to be destined for hosts in our 209.68.0.0 netblock. We have been working with Digex technical support in order to determine if there is a routing problem or filter in their network, but so far nothing has been determined.

    Any customer who believes they may be experiencing problems as a result of this is requested to send a traceroute to urgent@pair.com for prompt handling.

  • [Jun 9, 2000, 9:40 am] gebo downtime

    gebo (www139) crashed under heavy load and was brought back online. Downtime was around 10 minutes.
  • [Jun 8, 2000, 6:24 pm] cuzea downtime

    cuzea (www94) crashed under heavy load and has been brought back online. Downtime was approximately 15 minutes.
  • [Jun 8, 2000, 11:35 am] chi Downtime

    chi (www14) crashed under load. Downtime was approximately 15 minutes.
  • [Jun 7, 2000, 6:29 pm] UUnet Update

    As of 5:45pm, traffic to and from UUnet has returned to normal levels. UUnet has advised us that two of their backbone routers went out of service, precipitating this incident.

    We will proceed with our plans to shift inbound traffic to avoid this type of problem with UUnet.

  • [Jun 7, 2000, 4:55 pm] UUnet Problems

    Beginning around 4:40pm Eastern, we have observed a major drop in inbound UUnet traffic, and no corresponding rise in traffic through other providers. This implies that, once again, UUnet's network is partitioned and their internal routing is not properly dropping our routes. This means that UUnet peering points around the world will accept traffic destined for pair Networks, but then fail to deliver that traffic, effectively blackholing much of our traffic.

    We will be accelerating efforts to resolve this technical problem with UUnet; we have expressed our displeasure numerous times before. We will likely shift our inbound traffic primarily to Sprint in order to compensate for this effect, in the near future.

    Further information about the current brownout will be posted as it becomes available.

  • [Jun 6, 2000, 5:35 pm] parma Web Service

    Some web sites on parma (www42) experienced intermittent downtime this morning and afternoon due to a configuration error. This has been corrected, and the situation should not recur.
  • [Jun 6, 2000, 3:17 pm] Sprint Problems

    Sprint is advising all of its customers of a major outage on the East Coast, which may affect latency on some portions of their network. Although we have not seen any significant change in traffic, we are passing the notice along to any customers who may be affected.
  • [Jun 6, 2000, 2:00 pm] db3 Maintenance

    db3 was taken down briefly to install a hard drive, as the first step towards its upgrade to FreeBSD 3.4 and MySQL 3.23 (as described on the database upgrade plan). Downtime was approximately 10 minutes.
  • [Jun 6, 2000, 10:18 am] UUnet OC-3c Upgrade

    Beginning at approximately 8pm Eastern time today, we will be interrupting our UUnet connectivity briefly as we perform the first stage of our long-awaited OC-3c upgrade to UUnet. After installation and testing of the necessary equipment, the circuit switchover will be accomplished later this week. This will provide a significant increase in our capacity to UUnet, and allow us to improve overall route selection.

    Please note that we have additional network upgrades scheduled in the near term, including DS-3 to Genuity, and OC-3c to Sprint. Details are published at http://support.pair.com/notices/upgrades.html.

  • [Jun 6, 2000, 6:54 am] falku upgrade complete

    The drive swap on falku (www98) is now complete at 6:50am EST.
  • [Jun 6, 2000, 6:43 am] falku upgrade

    falku (www98) was down from 1:35am EST to 1:46am EST for the new hard drive installation. A second reboot will be performed at 6:45am EST to bring the new hard drive into live service.
  • [Jun 6, 2000, 12:55 am] falku upgrade

    falku (www98) will be down for a short period tonight while we perform a hard drive upgrade to solve space issues on that server. Expected downtime is under 30 minutes, and will begin around 1:30am EST.
  • [Jun 5, 2000, 4:11 pm] vuae reboot

    vuae (www93) was rebooted to clear some filesystem problems resulting from a mail loop. Downtime was less than 5 minutes, and it has since resumed normal operation.
  • [Jun 5, 2000, 3:26 pm] uilen Downtime

    uilen (www35) was down for 25 minutes this afternoon after a mailing loop and an extensive file cleaning. It is now back up and functioning normally.
  • [Jun 5, 2000, 6:18 am] xi Reboot

    xi (www8) was rebooted to clear a conflict. Downtime was approximately 5 minutes.
  • [Jun 5, 2000, 4:56 am] kappa Downtime

    kappa (www6) crashed under load and was brought back online after a systems check. Downtime was approximately 30 minutes.
  • [Jun 4, 2000, 5:56 pm] tinne Downtime

    tinne (www21) was down this afternoon for approximately 10 minutes. It has since returned to normal service.
  • [Jun 4, 2000, 3:48 am] onn Downtime

    onn (www29) crashed and was brought back online after a manual filesystem check. Downtime was approximately 30 minutes.
  • [Jun 2, 2000, 5:13 pm] beith Upgraded

    The drive swap on beith has been completed, along with a chassis swap to upgrade the system. beith is now a Pentium III at 600 MHz, with 256MB SDRAM and 20GB drive space.
  • [Jun 2, 2000, 12:35 pm] kodh downtime

    kodh (www90) crashed under heavy load and was restored to service with downtime around 15 minutes.
  • [May 27, 2000, 11:05 am] beith Maintenance

    One of beith's hard drives is reporting intermittent errors; we have begun the process of swapping both drives for a single larger model. This will be completed by Monday, and involves no appreciable downtime.
  • [May 26, 2000, 9:07 pm] epsilon Downtime

    epsilon (www3) crashed under heavy load. Downtime was 15 minutes.
  • [May 22, 2000, 5:30 pm] UUNet problems

    We have noticed problems with our UUNet backbone dropping some traffic. According to Technical Support, a number of their backbone routers lost their BGP sessions. They are still working on it and were not able to offer a specific estimate on when the situation would be improved. We have received a master ticket for the situation and will continue to monitor it closely.
  • [May 19, 2000, 8:49 pm] cicka downtime

    cicka (www167) rebooted under heavy load. It was brought back online with downtime under 10 minutes.
  • [May 19, 2000, 9:33 am] UUnet Status

    After extensive testing last night, UUnet has been unable to identify the problem with our circuit, which is experiencing brief outages, long enough to interrupt routing, every two to six hours. We will be working with them again today to attempt to resolve the issue. This does not have a major effect on customer traffic, but is not acceptable to us, either.
  • [May 18, 2000, 6:47 pm] UUNet Problems

    We are currently experiencing a series of short (10-20 second) outages on our DS-3 to UUNet. They are investigating the problem and it should be resolved by morning. Customer traffic is flowing over alternate circuits during these brief outages, and should not be adversely affected. Further information will be posted as it becomes available.
  • [May 15, 2000, 11:20 am] pi downtime

    pi (www10) crashed under heavy load and was brought back online with downtime around 20 minutes.
  • [May 15, 2000, 2:12 am] uilen Downtime

    uilen (www35) crashed under heavy load. It was brought back online within 5 minutes.
  • [May 13, 2000, 1:23 am] anca downtime

    anca (www53) rebooted under heavy load. It was quickly brought back online with downtime under 10 minutes.
  • [May 11, 2000, 3:26 pm] pi downtime

    pi (www10) crashed under heavy load and was brought back online with downtime around 20 minutes.
  • [May 11, 2000, 12:50 am] SAVVIS Network Problems

    From approximately 8pm to midnight tonight, we were seeing some problems with our DS3 to SAVVIS due to faulty filters on their gateway router. Customer traffic should not have been adversely affected, though there were times when some traffic would have been routed to SAVVIS, and not accepted for delivery. The problem has been found and corrected at this time, and we will be monitoring the circuit closely for the next few days to ensure that no further problems occur.
  • [May 10, 2000, 3:48 pm] dyyme downtime

    dymme (www71) was down this afternoon for approximately 10 minutes. It has since returned to normal operation.
  • [May 10, 2000, 2:04 pm] UUNet Update

    According to UUNet, the problems reported earlier were related to the reload of a backbone in the DCA area. They believe that everything should be stable at this time.
  • [May 10, 2000, 12:43 pm] UUnet Problems

    UUnet traffic has dropped by more than 50% in the past fifteen minutes, leading to an overall drop in outbound traffic. This suggests another significant outage has taken place somewhere in UUnet's network, and because of their non-dynamic internal routing, they may be blackholing some traffic destined for our network.

    We will be pursuing this issue with UUnet management. Outages elsewhere in the UUnet network should not have this degree of effect; nor do other backbones experience the blackhole problem. We will post more information on this brownout as it becomes available.

  • [May 8, 2000, 5:51 pm] tinco maintenance completed

    The drive maintenance on tinco has been completed, with an upgrade to 20.5GB of drive space. If any problems are detected as a result of this operation, please let us know at support@pair.com.
  • [May 5, 2000, 2:37 pm] tinco maintenance

    tinco was brought down briefly for drive maintenance. There will be another short downtime once this is completed, which will result in an overall upgrade to drive space for this server. If any problems are found with user files during this transfer, please let us know at urgent@pair.com.
  • [May 5, 2000, 9:34 am] pi downtime

    pi (www10) crashed under heavy load and was brought back online with downtime around 30 minutes.
  • [May 4, 2000, 12:04 pm] psi downtime

    psi (www15) crashed under heavy load, and was brought back online with downtime under 10 minutes.
  • [May 4, 2000, 7:59 am] pi Logs

    Logs for May 2 and May 3 have been generated on pi early this morning. There was an unusual configuration problem which should not recur.
  • [May 3, 2000, 2:01 am] gao downtime

    gao (www111) was brought down for a physical relocation after Saturday's maintenance. Downtime was less than 10 minutes.
  • [May 1, 2000, 11:28 pm] UUnet Problems

    UUnet is reporting serious problems throughout their network; serious enough that they have promptly posted the information on their NOC page. The impact on our network has been a temporary loss of inbound traffic via UUnet, which caused a corresponding drop in overall outbound traffic. The duration of the event was approximately fifteen minutes, and traffic is now returning to normal.

    This sort of incident is made worse by the behavior in UUnet's network that we have pointed out before; when internal connectivity to a destination is lost, routes for that destination are not promptly withdrawn. Consequently, UUnet edge nodes will happily accept traffic destined for our network even when there is no way for UUnet to deliver it to us; this prevents other networks from selecting an alternate route and defeats the stability otherwise provided by BGP. Unfortunately, many networks rely on UUnet to reach our network because of the otherwise superior performance it provides.

    We will post further details if any are provided by UUnet.

  • [Apr 30, 2000, 1:38 am] gao Update

    We have discovered that some users on gao did not have their accounts fully restored. We will be working through the night to identify the affected accounts and restore them from backup. We do have current backups of all data; the most recent level was from Saturday morning, less than 24 hours before the crash.
  • [Apr 29, 2000, 11:11 pm] Network Resolution

    As of 10:05pm Eastern time, all network service has been restored to normal. The ultimate cause was traced to problems with CEF, Cisco Express Forwarding, on two of our four internal routers. CEF routing tables on these routers became corrupted sometime on Friday, although initially the impact was very small, with only a handful of addresses affected. When the problem was discovered at 5:30pm today, attempts to diagnose and remedy the problem led to considerably worse results, with nearly all of our network affected at one point.

    Although the attempted cure was at least temporarily worse than the disease, several other problems, which may have been related, were cleared up as we worked on the main problem. One reconfiguration required some work with our upstreams to ensure that they were receiving our announcements correctly. Several pieces of equipment, as well as cabling, were swapped during this time as well.

    The problem has been eliminated from our network as far as we can tell. Unfortunately, there is no specific way to monitor for CEF discrepancies, but if this type of problem should recur, we will know immediately where to look. As an aside, we have been successfully using CEF on these routers since July 1999. Although many providers report problems (CEF is often referred to as "Customer Enragement Feature"), this is the first incident we've experienced. Unfortunately, it was a major incident.

    The routers affected this evening happen to be among the equipment scheduled to be replaced by the carrier-class Black Diamond switches we already have on order. Details are available at http://support.pair.com/notices/upgrades.html

    Just to make things more interesting, the primary hard drive on gao died shortly after the routing problems began. The incidents are completely unrelated, but poorly timed. gao's data has been restored from backup on a new drive, and it is now running with no problems.

    Any customer who observes any ongoing problem reaching any site is encouraged to send a traceroute to urgent@pair.com. We will investigate any possibly related problems immediately.

    We apologize for the interruption in service and its potentially significant impact. We recognize the critical importance of maintaining the best possible network service at all times, and we are committed to doing so, now and in the future.

  • [Apr 29, 2000, 9:10 pm] Network Update

    As of approximately 8:45pm, the original source of failure has been eliminated. We have continuing suspicions about the behavior of one particular router, however, and may be swapping it out later this evening.

    There is an ongoing problem with traffic routing from certain addresses to UUnet's network. This is a BGP issue, and the fault lies with UUnet. We are working with them to resolve the issue at this moment.

    Further details will be posted.

  • [Apr 29, 2000, 7:56 pm] Network Update

    We have narrowed down the internal network problem to a specific central switch which appears to be operating unreliably. We are in the process of swapping this switch out for a spare. Although otherwise behaving normally, the switch is resolutely dropping 60-80% of traffic destined for specific IPs, without rhyme or reason. This is a switch that was already planned for upgrade.

    gao is still being rebuilt, and should be back online within the next two to three hours. Its primary drive failed entirely, and a new drive is being built from backups.

  • [Apr 29, 2000, 6:27 pm] Network Troubles

    Beginning around 5:30pm today, we discovered unusual network behavior involving an internal router. Initially, only two dedicated servers were affected, as well as some internal systems. In the course of our investigation, one problem was corrected, but overall the matter has worsened, with more servers being affected. We are continuing to work diligently on this problem, and will post more information as it becomes available. We believe this problem is solely internal and can be resolved without further interruptions of service.
  • [Apr 29, 2000, 6:14 pm] gao emergency maintenance

    Gao's hard drive has failed under load and is currently being replaced. Estimated downtime to copy the data from backups to the new drive is approximately 1-3 hours.
  • [Apr 28, 2000, 12:39 pm] UUnet Resolution

    After a brownout of approximately one hour overall, all traffic through UUnet appears to have returned to normal. We apologize for the inconvenience; because of UUnet's internal routing scheme, backbone outages do not effectively propagate withdrawn routes. This means that other portions of UUnet's network will happily accept traffic destined for a customer network such as our own, and then fail to deliver that traffic due to a backbone outage. We have observed this behavior before, and are considering ways to work around it in the future.
  • [Apr 28, 2000, 12:05 pm] UUnet Further Troubles

    The recovery of UUnet's routing was short-lived. They have indicated to us that they are currently having more serious backbone problems, and a ticket has been opened. As further information becomes available, we will post it here.
  • [Apr 28, 2000, 11:47 am] UUnet Resolution

    UUnet has advised us that one of their national transit routers failed, interrupting a portion of their network until routing sessions could be re-established. When this happens, some traffic flowing through UUnet will be delayed or diverted for a period of 30-45 minutes. We are now seeing traffic levels recover to normal.
  • [Apr 28, 2000, 11:43 am] UUnet Trouble

    We are seeing a noticeable drop in traffic across all circuits, due to apparent inbound problems on UUnet's network. We are watching this carefully and will post further information as it becomes available.
  • [Apr 27, 2000, 12:07 pm] pi downtime

    pi (www10) crashed under heavy load and was brought back online with downtime around 20 minutes.
  • [Apr 25, 2000, 4:11 pm] pi downtime

    pi (www10) crashed under heavy load and was brought back online with downtime around 20 minutes.
  • [Apr 25, 2000, 2:32 pm] theta downtime

    theta (www4) crashed under heavy load, and was brought back online with downtime around 20 minutes.
  • [Apr 23, 2000, 7:14 pm] theta Downtime

    theta crashed under heavy load and was promptly brought back online. Total downtime was less than fifteen minutes.
  • [Apr 22, 2000, 10:41 am] Digex Network Warning

    Digex/Intermedia has warned us of a problem with their network in the Washington DC area. An electrical fire at a Metro Station has removed at least one circuit from service, but their technicians will not be allowed in the area to address the problem until Sunday morning. Consequently, there may be increased latency due to load on other portions of their network. We have not seen any problems from our end, but this information may be useful to any customers affected.
  • [Apr 21, 2000, 10:11 pm] Digex Slowdown

    Intermedia/Digex has informed us that due to fiber damaged in a fire in Washington, DC we can expect to see some additional latency in their network. This means that some customer traffic might be noticeably slower until they are able to repair the damaged fiber on Sunday morning. Despite the slight slowdown, however, all traffic will continue to reach its destination as normal.
  • [Apr 17, 2000, 12:17 pm] pi downtime

    pi (www10) crashed under heavy load. After an extensive filesystem cleaning it was brought back online, with downtime around 20 minutes.
  • [Apr 13, 2000, 9:22 pm] omega Downtime

    omega (www16) crashed under heavy load. Downtime was 15 minutes.
  • [Apr 13, 2000, 1:24 pm] UUNet Performance

    We've observed some performance problems this afternoon on our DS-3 to UUNet. UUNet has informed us that they are due to the link between their Pittsburgh Router and one of their gateway routers having gone down briefly. We will be closely monitoring the circuit for the rest of the day to ensure that customer traffic is not adversely affected.
  • [Apr 13, 2000, 12:45 pm] bemnet Downtime

    bemnet (www158) crashed under load and was brought back online with downtime of approximately 10 minutes.
  • [Apr 12, 2000, 7:34 pm] nuin downtime

    nuin (www18) crashed under heavy load and was brought back online after a file system cleaning. Downtime was approximately 10 minutes.
  • [Apr 11, 2000, 4:25 pm] yanta downtime

    yanta (www67) crashed under heavy load, and was brought back online with downtime around 10 minutes.
  • [Apr 11, 2000, 12:47 pm] mu Maintenance Completed

    The drive maintenance on mu has been completed, after another 10 minutes of downtime. mu is now a Pentium III at 600 MHz, with 256MB SDRAM and a 20GB drive.
  • [Apr 11, 2000, 6:16 am] tinco Upgrade

    tinco has been upgraded to a Pentium III at 600 MHz with 256MB SDRAM. Downtime for the upgrade was approximately ten minutes.
  • [Apr 11, 2000, 4:18 am] mu Emergency Maintenance

    Emergency maintenance has begun on mu, to replace its two older hard drives with a new model. One of the drives is experiencing unrecoverable errors. When this process is complete, within several hours, the server CPU and RAM will also be upgraded.
  • [Apr 11, 2000, 3:34 am] cuzea Downtime

    cuzea crashed under heavy load, and was brought up after an extensive filesystem cleaning. Total downtime was approximately 30 minutes.
  • [Apr 11, 2000, 1:12 am] Sprint Resolution

    At 12:35am Eastern time, our Sprint circuit was restored to normal service. Sprint had significant difficulties identifying the problem in the routing equipment on their end. We have been assured that the outage, nearly ten hours in duration, is an unusual situation and will not likely be repeated. We will continue to monitor the circuit closely.
  • [Apr 10, 2000, 9:22 pm] Sprint Outage

    The Sprint circuit is still down, with no ETA for repair. Sprint has now escalated the problem several times, but has yet to find its source beyond determining that it lies with one of their gateway routers. They assure us that they are working to get it back online as soon as possible. Until then, customer traffic is continuing to flow over our other circuits. We will continue to post more information as it becomes available.
  • [Apr 10, 2000, 4:14 pm] Sprint Outage

    We've just discovered that the Sprint outage is worse than we thought. While they are accepting some outbound traffic from us, it is then never leaving their network. This is effectively causing some customer traffic to be lost entirely. We are still working with them to isolate and fix the problem as quickly as possible, and will post another notice as soon as we have more information.
  • [Apr 10, 2000, 3:54 pm] Sprint Outage

    We are currently seeing a drastic reduction in the amount of traffic flowing on our Sprint circuit. All customer traffic is flowing over alternate paths at this time, and we are working with Sprint to find and fix the problem. Further details will be posted as they become available.
  • [Apr 10, 2000, 5:06 am] Sprint Network Outage - Resolution

    Connectivity through Sprint has been restored at 4:48 am EST. Total downtime for the connection was 159 minutes.
  • [Apr 10, 2000, 2:58 am] Sprint Network Outage

    At approximately 2:00 am EST, connectivity through Sprint was lost. The issue has been escalated to Sprint for resolution. Further details will be posted when they become available.
  • [Apr 8, 2000, 4:44 pm] emancholl Upgrade

    emancholl has been upgraded to a Pentium III at 600 MHz, with 256MB SDRAM and a 15GB drive.
  • [Apr 8, 2000, 3:38 pm] harma Upgrade

    The drive swap and system upgrade of harma has been completed as of 3:35pm Eastern time. The server is now a Pentium III at 600 MHz, with 256MB SDRAM and a 20GB drive.
  • [Apr 8, 2000, 12:52 pm] harma Maintenance

    Due to recent errors, we are replacing the hard drive in harma. There will be ten minutes of downtime later today, as the swap is completed.
  • [Apr 7, 2000, 10:38 pm] emancholl Downtime

    emancholl (www37) suffered a crash and was rebooted. Downtime was 10 minutes.
  • [Apr 7, 2000, 4:11 pm] cicka Reboot

    cicka (www167) required a reboot after a manual reconfiguration problem. After the reboot, it has returned to regular service. Downtime was just over 15 minutes.
  • [Apr 7, 2000, 1:17 pm] enda downtime

    enda (www80) crashed under heavy load and was brought back online with downtime around 20 minutes.
  • [Apr 5, 2000, 2:13 pm] Routing Interruption

    One of our gateway routers went down briefly at 1:35pm Eastern time today. Slight reconfiguration was required when it returned to service. We are investigating possible causes.
  • [Apr 4, 2000, 5:01 pm] Digex Problems

    Beginning around 4:50pm, we have observed severe latency on Digex's backbone, as well as a sharp decrease in routes announced and traffic passed. We believe there is a general problem on their network. We will post more information as it becomes available. In the meantime, alternate paths are handling traffic as necessary.
  • [Apr 4, 2000, 3:34 pm] kodh downtime

    kodh (www90) crashed under heavy load and was brought back online with downtime around 20 minutes.
  • [Apr 2, 2000, 11:51 am] Server Upgrades

    ngetal, eite, and ceirt have been upgraded to Pentium III 600 MHz systems with 256MB SDRAM each. This is part of our ongoing upgrade plan.
  • [Mar 29, 2000, 11:25 pm] pyyl Downtime

    pyyl (www97) crashed due to heavy load. Downtime was 10 minutes.
  • [Mar 29, 2000, 5:47 am] onn Downtime

    onn crashed four times between 4am and 5:30am Eastern time today. Although we initially suspected a hardware problem and were preparing to swap onn into a new chassis, the crashes were in fact traced to an extremely high network load, more than 100 times beyond the normal traffic carried by that server. This load was caused by an extremely popular file being posted to an FTP account and downloaded through HTTP. Please note that FTP is more suitable for lengthy downloads. Also, any customer who needs services in the 200GB/day range is asked to please contact us beforehand for an appropriate QuickServe solution.
  • [Mar 28, 2000, 9:51 pm] UUnet Resolution

    Our UUnet circuit is once again routing traffic. The problem was tied to a Pittsburgh router that has since been repaired. Traffic should again be normal for all customers.
  • [Mar 28, 2000, 7:05 pm] UUnet Network Status

    At approximately 6:42PM EST this evening, UUnet suffered an outage that brought down our connection with them. All traffic is currently being routed through our other providers, and they appear to be holding up. Customers may experience some slowdowns until this is resolved, at which point we will post a follow-up.
  • [Mar 27, 2000, 5:27 pm] pi downtime

    pi (www10) crashed under heavy load and was brought back online, after an extensive filesystem cleaning, with downtime around 20 minutes.
  • [Mar 27, 2000, 4:19 pm] UUnet Network Status

    UUnet has reportedly suffered a major fiber cut somewhere east of here. Although Pittsburgh was not partitioned from their network in this case, it did cause a routing flap and brief interruption of our UUnet service. It appears to be returning to normal at this time, although customers may still be directly affected if they are on UUnet's network in the affected area.
  • [Mar 27, 2000, 9:26 am] iota Downtime

    iota (www5) crashed under load and was brought back online after a filesystem check. Downtime was approximately 15 minutes.
  • [Mar 26, 2000, 9:26 am] gort Upgraded

    gort has been upgraded to a Pentium III at 600 MHz, with 256MB SDRAM. We expect to upgrade the other few remaining Pentium Pro servers within the next two weeks.
  • [Mar 26, 2000, 8:33 am] gamma Upgraded

    The drive upgrade on gamma has been completed. The server has also been upgraded to a Pentium III at 600 MHz, with 256MB SDRAM. We will be continuing these upgrades for all older servers over the coming weeks; an article will be posted shortly with more details.
  • [Mar 25, 2000, 8:26 am] gamma Emergency Maintenance

    gamma will be taken down briefly to install a new hard drive. Not only is it due for an upgrade, but one of the old drives is beginning to show errors. Total downtime should be less than fifteen minutes. The swap will continue while the server is online, and be completed with brief downtime again later today or tomorrow.
  • [Mar 23, 2000, 8:37 am] pi Downtime

    pi (www10) crashed under heavy load, and was rebooted. Downtime was about 35 minutes.
  • [Mar 22, 2000, 2:58 pm] Network Solutions Template Change

    Effective this week, we have begun using the newest e-mail template, version 6.0, that Network Solutions (NSI) provides for domain registrations and modifications. This template will be required by NSI effective April 1, 2000. We adopted it as soon as our testing was completed, primarily because it allows new domains to opt-out of NSI's new bulk marketing programs, under which they sell contact information to third parties without any other form of consent.

    We have selected the opt-out by default, as we expect this to be the preferred setting. If you prefer to be marketed to by third parties, please contact Network Solutions.

  • [Mar 22, 2000, 4:57 am] iota Downtime

    iota (www5) crashed under heavy load. It has since returned to normal service. Downtime was about 15 minutes.
  • [Mar 20, 2000, 10:37 pm] straif Downtime

    straif (www26) crashed due to load. It was brought back up in under 15 minutes.
  • [Mar 15, 2000, 3:59 pm] nuumen downtime

    nuumen crashed under heavy load, and was rebooted. Downtime was approximately 10 minutes.
  • [Mar 15, 2000, 2:19 pm] gamma downtime

    gamma crashed under heavy load, and required an extensive file system check upon reboot. Downtime was approximately 10 minutes.
  • [Mar 15, 2000, 12:12 am] SAVVIS Maintenance

    Connectivity through our SAVVIS circuit has returned at this time. The problem centered on their New York router, but when we discussed it with SAVVIS support, they were unable to give us any other information as to the cause.
  • [Mar 14, 2000, 11:46 pm] SAVVIS Emergency Maintenance

    At around 11:07 pm EST, we started having trouble with our SAVVIS connection. SAVVIS support claimed that the outage was part of an emergency maintenance window that was expected to last 5 minutes, but it has since lasted almost 30. We will post more information as it becomes available.
  • [Mar 14, 2000, 2:46 pm] gort downtime

    gort crashed, requiring a reboot and file system check. Downtime was approximately 10 minutes.
  • [Mar 14, 2000, 2:16 am] othala Downtime

    othala (www150) crashed and required a manual disk check before rebooting. Downtime was 10 minutes.
  • [Mar 13, 2000, 12:02 am] ilwe Downtime

    ilwe (www81) was down this evening for approximately 10 minutes. It has since returned to normal operation.
  • [Mar 4, 2000, 6:45 pm] UUnet Resolution

    Just before 6pm Eastern time, our UUnet connectivity returned to normal status, for a total outage duration of nine hours, as predicted by UUnet technical support.

    All routing now appears to be normal; if there have been any other effects, we will post further information.

  • [Mar 4, 2000, 3:07 pm] hwesta Downtime

    hwesta (www51) was rebooted due to load issues. Downtime was 5 minutes.
  • [Mar 4, 2000, 11:58 am] UUnet Update

    UUnet reports that they have a total of ten OC-48 trunks out of service in Pennsylvania, completely isolating their Pittsburgh presence from the backbone. Due to the magnitude of the problem, no resolution is expected any sooner than 6pm Eastern time today, which would represent a remarkably long total outage time of at least nine hours.

    Customer traffic continues to flow normally through alternate providers with no disruption. We will post updated information when it becomes available.

  • [Mar 4, 2000, 9:06 am] UUnet Outage

    UUnet has lost all outbound connectivity in Pittsburgh, as of 8:45am Eastern time today. Although we can reach their gateway router directly over our circuit, it is isolated from the rest of their network. Luckily, the outage is severe enough that UUnet's internal routing has been updated, and all customer traffic is now comfortably flowing through SAVVIS, Sprint, and Digex. In fact, we can reach UUnet's network easily via SAVVIS, as they are connected to UUnet in New York City.

    We will post updated information as it becomes available from their NOC.

  • [Mar 3, 2000, 8:19 pm] theta Downtime

    theta (www4) was rebooted under severe load and required extensive filesystem checks and an additional reboot. It has now returned to normal service. Downtime was approximately 30 minutes.
  • [Mar 3, 2000, 1:10 pm] nwalme downtime

    nwalme (www57) crashed under heavy load and was brought back online with downtime around 15 minutes.
  • [Mar 3, 2000, 10:11 am] ailm Downtime

    ailm (www28) crashed and required a manual filesystem cleaning before being brought back online. Downtime was approximately 10 minutes.
  • [Mar 2, 2000, 2:48 pm] auma maintenance

    auma was taken down briefly to replace a failing network card. It has since returned to normal service, with approximately 10 minutes of downtime.
  • [Mar 2, 2000, 12:01 am] idad Downtime

    idad (www32) was down this evening for approximately 5 minutes. It has since returned to normal operation.
  • [Mar 1, 2000, 10:20 am] Denial of Service Attack

    Beginning at 9:48am Eastern time today, pair Networks was brought under a severe denial of service attack, with more than 100Mbps of traffic directed into its network from multiple attacking sources.

    Because a chain is only as strong as its weakest link, our network was affected; a portion was effectively knocked offline for eleven minutes. Without being more specific (to the benefit of attackers), we are pleased to report that the weakest link was previously scheduled to be replaced with an extremely powerful alternative, which is now on order. We will expedite this upgrade as much as possible. We will also be accelerating the scheduled improvements for our gateway routers, which were originally planned to accommodate OC-3 circuits, but will also help protect against this type of attack.

    There is no absolute protection against denial of service attacks; the only cure is better defense of other servers and networks, and the only short-term fix is to have so much capacity that it can't be overwhelmed (currently this may not be possible; witness the attacks on Yahoo and eBay). Rest assured that we will do everything possible to defend our network and build it out to be able to survive any type of attack. For the record, we have sustained eight smaller attacks over the past two weeks, with no impact on performance or connectivity for any customer.

  • [Mar 1, 2000, 12:39 am] enda maintenance completed

    The previously mentioned drive upgrade for enda has been completed. Downtime was less than 10 minutes.
  • [Feb 28, 2000, 11:02 pm] falku Downtime

    falku (www98) crashed due to high load and was brought back online after a manual filesystem check. Downtime was approximately 10 minutes.
  • [Feb 25, 2000, 8:03 pm] naudhiz Downtime

    naudhiz (www141) suffered a brief outage from a crash caused by high server load. Downtime was 15 minutes.
  • [Feb 25, 2000, 2:24 pm] pi downtime

    pi crashed under heavy load, and required a filesystem check after rebooting. Downtime was approximately 10 minutes.
  • [Feb 24, 2000, 2:06 pm] arda downtime

    arda (www62) crashed under heavy load and was brought back online with downtime under ten minutes.
  • [Feb 23, 2000, 7:32 am] auma Downtime

    auma (www86) crashed under load and was brought back online after a filesystem cleaning. Downtime was approximately 10 minutes.
  • [Feb 23, 2000, 2:40 am] anca Downtime

    anca (www53) was down this evening for approximately 5 minutes. It has since returned to normal operation.
  • [Feb 19, 2000, 11:10 pm] bemnet Maintenance

    The swap of system memory for bemnet has been completed. We will continue to monitor the situation in case further hardware work is necessary.
  • [Feb 19, 2000, 9:55 pm] bemnet Maintenance

    We have investigated the recent erratic behavior of bemnet, including spontaneous reboots, and concluded that there is a problem with the system memory. bemnet will be taken down within the next 24 hours for approximately five minutes, in order to swap the system memory. If this does not correct the problem, a complete swap to a new motherboard and CPU will be required. We will post further information as it becomes available.
  • [Feb 18, 2000, 9:21 am] enda Maintenance

    enda (www80) was taken down briefly to prepare for a hard drive upgrade that had been postponed earlier. The upgrade is being done to increase available disk space. Downtime was approximately 10 minutes.
  • [Feb 15, 2000, 6:11 pm] vala maintenance completed

    vala (www59) is back up after the replacement of its ethernet card. Downtime was under five minutes.
  • [Feb 15, 2000, 3:32 pm] vala maintenance

    vala (www59) will be taken down at approximately 6:00 PM Eastern to replace a defective ethernet card. Downtime should be no more than five to ten minutes.
  • [Feb 15, 2000, 11:25 am] ailm downtime

    ailm (www28) crashed under heavy load and was brought back online with downtime around 10 minutes.
  • [Feb 15, 2000, 1:57 am] SAVVIS Maintenance

    SAVVIS has just announced another emergency maintenance window to start sometime in the next 2 hours and last approximately 30 minutes. Customer traffic should be affected, but not significantly.
  • [Feb 14, 2000, 9:25 am] paat Downtime

    paat (www100) crashed under load and was brought back online following a manual filesystem cleaning. Downtime was approximately 15 minutes.
  • [Feb 11, 2000, 5:56 pm] bemnet downtime

    bemnet (www158) crashed under heavy load and was brought back online with downtime around 10 minutes.
  • [Feb 11, 2000, 12:14 pm] SAVVIS Emergency Maintenance

    After three more outages in SAVVIS' New York City POP, they have just announced another emergency maintenance window for 12:45pm Eastern time, approximately 30 minutes from the time of this posting. The outage may take as long as 30 minutes. Customer traffic will be affected, but not dramatically.

    We would like to state publicly that we are pursuing termination of our service with SAVVIS, based on a poor service record, a lack of forthright information about problems, and an absence of future growth capacity. We expect to shift traffic towards Sprint and our new GTE circuit, once it is online.

  • [Feb 10, 2000, 1:37 pm] db2 Resolution

    The problems on db2 have been identified and corrected. The MySQL daemon on db2 has been stable for the past several hours, and we will continue to monitor it.
  • [Feb 10, 2000, 12:31 pm] SAVVIS Maintenance

    SAVVIS has advised us that there is a problem with a line card on the New York City router we are connected to, which likely explains the eleven brief outages we've experienced with them in the past week. They will be replacing that card on an emergency basis today or tonight, which will disrupt our connectivity through SAVVIS for fifteen minutes or less. Hopefully this will eliminate the problem in general.
  • [Feb 10, 2000, 10:29 am] db2 Problems

    db2 experienced intermittent problems with its MySQL daemon overnight, which have begun to recur this morning. We are currently working to correct the problem, as well as to relocate databases to other servers as part of our database upgrade plan.
  • [Feb 10, 2000, 4:23 am] Digex Resolution

    The problems with Digex are now cleared up. Traffic should again be normal for all users.
  • [Feb 10, 2000, 3:12 am] Digex Trouble

    We are currently experiencing problems with our leased line to Digex. We have been told by Digex that they are aware of the problem and are working on it. We will keep on top of this and post more information as it is available.
  • [Feb 9, 2000, 3:36 pm] MySQL Upgrades

    The daemons on all MySQL database servers were upgraded this afternoon to install a security patch. Downtime for each daemon was less than 2 minutes.
  • [Feb 9, 2000, 3:52 am] quan Maintenance

    quan will be shut down briefly at approximately 4:05am to have its ethernet card replaced due to hardware errors observed with the current one. Downtime should be less than 10 minutes.
  • [Feb 9, 2000, 1:08 am] Network Troubles

    Due to the Internet-wide backlash from the recent Distributed Denial of Service attacks against several major websites, the Internet at large is seeing massive amounts of congestion and poor routing. This may cause some customers to see poor performance when reaching their sites hosted at pair, or even not be able to reach them at all in some extreme circumstances. We are working with our upstream providers to try to restore things to normal as quickly as possible, at least from our point of view. Unfortunately, as this is a problem affecting the entire Internet, it is mostly out of our hands. We'll post more information as it becomes available.
  • [Feb 8, 2000, 4:10 am] UUnet Brownout

    At 3am Eastern time, we observed a sharp decrease in traffic to and from UUnet. The traffic shifted to SAVVIS and Sprint, which implies significant changes in BGP route announcements. At the same time, UUnet's Pittsburgh gateway became almost entirely non-responsive to anything but routed traffic. The possible causes include a configuration error at UUnet's Pittsburgh gateway, or major outages elsewhere in UUnet's network, thereby affecting route distribution.

    We did not observe any such outages, and UUnet's first explanation was, incredibly, that the problem was on our end and was not in their network. By the time we received a second explanation an hour later, the network routing had suddenly returned to normal. That explanation was that they are doing extensive maintenance in East Coast POPs, and that we should expect to see intermittent problems until 6am.

    Of course, we receive daily notification from UUnet about planned maintenance everywhere within their global network, from software upgrades in Saint Louis to modem swaps in Melbourne, and for at least the last two weeks, no such round of maintenance has been planned. The announcement for today's maintenance, in fact, lists only Stockholm, Sweden. Of course, that announcement has been the same three days in a row, so perhaps it is erroneous.

    Customer traffic was not significantly affected, although our faith in the forthrightness and reliability of one of our upstreams has been shaken. We are honest and direct with our customers, and truly wish that our providers would do the same for us.

  • [Feb 7, 2000, 12:44 am] SAVVIS Trouble

    Once again we are having trouble with our DS-3 to SAVVIS. The problem appears to be similar in nature to what we experienced last Tuesday (2/1), though so far there have only been 2 brief 5-10 minute outages. Customer traffic has not been adversely affected by these outages, as we have plenty of extra capacity online now with our other providers. We are working with SAVVIS to resolve the problem, and will post further details as they become available.
  • [Feb 4, 2000, 5:31 pm] Internal Routing Brownout

    The internal routing maintenance announced for last night was considered the difficult portion of some network changes we're making. Those changes went smoothly and caused no disruptions of traffic. The portion considered "easy", however, took place around 4:30pm today, and did disrupt traffic destined to approximately two dozen servers for approximately ten minutes.

    The changes are intended to improve the overall stability of the internal network; we are also evaluating a possible network upgrade that would improve our internal routing capacity by a factor of several hundred, while also simplifying our network design. We do not expect any further disruptions due to configuration changes or unreliable Cisco features. We will post details of the proposed network upgrade as soon as we have completed our evaluation of the required technology.

    We apologize for the interruption of traffic to affected servers and customer sites.

  • [Feb 3, 2000, 8:14 pm] Network Maintenance Complete

    The scheduled network maintenance is complete, with no problems. As expected, customer traffic was not affected.
  • [Feb 3, 2000, 3:57 pm] uilen downtime

    uilen (www35) crashed under heavy load and was brought back online with downtime around 15 minutes.
  • [Feb 3, 2000, 3:21 pm] Network Maintenance

    Starting at approximately 8pm tonight we will be briefly taking down each of our internal routers for maintenance. Customer traffic should not be adversely affected. Each router's downtime will be approximately 5 minutes.
  • [Feb 3, 2000, 2:09 pm] Problems with Network Solutions

    We feel obligated to warn our customers that we have consistently been having problems with response time and correctness when dealing with Network Solutions in recent weeks. We are frequently given incorrect information to rely on and pass along to our customers. This includes issues such as what e-mail addresses to send templates to, how templates should be filled out, and even (most recently) which template should be used.

    Competition has been introduced for domain registrations under the generic top-level domains of .COM, .NET, and .ORG. Network Solutions operates a central registry, separate from its operations as a registrar, and that registry deals with competing registrars, accredited by ICANN, to accept domain registrations. Many of the competing registrars have different procedures, behaviors, and pricing, when compared to Network Solutions' registrar business.

    At the present time, our systems do not directly support these alternate registrars. Also, our Gold Premier status with Network Solutions is important to us, as it allows our customers to register domains with deferred payment, and provides us with a contact channel through which we can at least attempt to rectify some of the problems mentioned above. We do, however, feel that the current situation is unsuitable, and that competition in domain registration is essential. Consequently, we have taken several steps. First, we have received our own accreditation from ICANN, and expect to be handling domain registration directly as soon as April of this year. Second, our pair2000 software system, due for user debut on March 15, will include support for one or more competing registrars.

    In the meantime, we will continue to assist customers who encounter problems using Network Solutions' registrar services. We have been advised that Template 5.0 is our best choice, although suddenly only Template 6.0 is accepted. We hope to have this resolved today.

    Customers who wish to use alternate registrars will need to handle those registrations directly with such registrars, and present the domains to us as "transfers from other NIC". If you receive a template designed for use with Network Solutions but do not use them as your registrar, you may simply copy the nameserver information from that template.

    We look forward to improving this situation and eliminating unpleasant experiences with domain registration.

  • [Feb 2, 2000, 9:53 pm] upsilon Downtime

    upsilon (www13) crashed due to heavy load. Downtime was 10 minutes.
  • [Feb 1, 2000, 6:49 pm] SAVVIS Network Trouble

    Our SAVVIS line is once again experiencing problems and is currently down. We are on the phone right now with their technicians to attempt to resolve this.

    In the meantime, our Digex line is now functional, and this is doing a good job of helping to balance the loss. As a result, most users should not see significant problems in reaching us.

    We are, nevertheless, working right now to bring our SAVVIS line back up and will post another notice once we have more to report.

  • [Feb 1, 2000, 5:58 pm] SAVVIS Update

    The most recent SAVVIS outage has continued for fifteen minutes, and appears to be a more serious problem than originally realized. We are attempting to get further information from SAVVIS.
  • [Feb 1, 2000, 5:50 pm] SAVVIS Network Trouble

    We are seeing trouble with SAVVIS today; after two brief outages during their overnight maintenance window, their New York connectivity has gone offline entirely twice today. Their explanation so far is that they are still having problems with their New York backbone connectivity, similar to the incident on January 27. The outages have been brief and have had minimal effect on customer connectivity. We will post further information if it becomes available.
  • [Jan 31, 2000, 10:34 am] db4 Reboot

    db4 was rebooted to clear a resource-usage problem -- it was brought back online within 10 minutes.
  • [Jan 30, 2000, 6:53 am] Network Problems

    Beginning around 4:30am Eastern time today, one of our internal routers experienced severe instability due to a low memory condition. The memory was exhausted by a routing problem elsewhere on the Internet. Unfortunately, the failure mode caused the router to "flap" (go offline and online again repeatedly) rapidly, which prevented other routers from successfully taking over its traffic.

    Within an hour, we had returned routing to normal. We have reconfigured the affected router so that it carries less traffic and uses memory more efficiently. We will be upgrading memory on all of our internal routers to their maximum physical capacity this week, in order to reduce the possibility of any similar failure in the future. We also have additional routing equipment already on order to handle anticipated growth in demand.

    Approximately one-third of customer sites experienced a network "brownout" during the event. We apologize for the inconvenience, and are taking every step possible to ensure the problem does not recur.

  • [Jan 27, 2000, 1:19 pm] Network troubles

    A sharp drop in traffic to and from our network alerted us to an incident in progress. So far, we have heard of a major fiber cut between here and New York which may be the cause. This caused some network instability in the overall region while it was worked around, and hampered our connectivity for approximately 20 minutes. Currently, we appear to have full service again. We will post additional details as more information becomes available.
  • [Jan 26, 2000, 1:30 pm] chi downtime

    chi (www14) became unresponsive after suffering high load due to runaway CGI processes. It came back after a clean reboot and file system check, and downtime was less than 15 minutes.
  • [Jan 25, 2000, 11:07 pm] SAVVIS Emergency Maintenance

    SAVVIS has announced a major emergency maintenance window for this evening, spanning 10pm to 3am Eastern time. Their connectivity in St Louis, New York City, and Chicago will be severely reduced for multiple periods of approximately 20 minutes each. Customer traffic on these routes may be significantly degraded during that time. We are connected via New York City.

    We apologize for the short notice; we have only been notified by SAVVIS within the last hour. We did observe performance problems in SAVVIS' network today, but compensated by rerouting some traffic manually. We hope that the maintenance will eliminate those problems and allow us to fully utilize our SAVVIS connectivity once again.

  • [Jan 25, 2000, 3:26 am] coll Downtime

    coll (www22) became unresponsive due to a lack of system resources and needed to be rebooted. Downtime was less than 15 minutes.
  • [Jan 24, 2000, 8:53 am] Emergency Maintenance

    We are currently rebooting all user servers, in order to address two recently-discovered security weaknesses. Downtime for each server will be less than five minutes, and there will be no other effect on user functionality.
  • [Jan 24, 2000, 4:50 am] UUNet Maintenance

    UUNet will be performing maintenance on their Pittsburgh router during their maintenance window on the morning of Tuesday 1/25 beginning around 3:00am EST. Customer traffic will continue to flow on our other circuits and should not be adversely affected.
  • [Jan 22, 2000, 9:16 pm] uilen Downtime

    uilen (www35) crashed due to heavy load. Downtime was 30 minutes.
  • [Jan 21, 2000, 6:37 pm] theta Downtime

    theta (www4) rebooted under heavy load. Downtime was around 15 minutes.
  • [Jan 21, 2000, 1:35 pm] UUnet Routing

    While we were performing some emergency rebalancing of traffic this afternoon around 1pm, UUnet traffic was impaired for approximately three minutes. We have corrected the problem, and have now shifted more traffic towards UUnet. Our bandwidth expansion plans now consist of bringing Digex online within the next two weeks, having GTE online by the end of February, and activating our UUnet OC-3c in early March.
  • [Jan 20, 2000, 9:17 am] PSInet Problems

    PSInet is having major peering problems with Sprint; this was noticed yesterday but has resumed during periods of high traffic today. Sprint is working with them on the issue; hopefully they will upgrade their private peering to accommodate the level of traffic.

    The impact on customers is that interactive response from Earthlink or other PSInet-connected ISPs may be poor at certain times of the day. We will attempt to redirect our outbound traffic through another backbone within the next 24 hours, but we cannot affect the congestion on the inbound route.

  • [Jan 19, 2000, 12:39 pm] enda Upgrade

    enda (www80) will be taken down briefly this afternoon to begin the process of replacing the hard drive. Another brief period of downtime will be required when the swap is completed.
  • [Jan 19, 2000, 10:27 am] db4 MySQL Downtime

    The MySQL daemon was down for approximately 20 minutes this morning after a failure occurred in the automated process for monitoring it. The problem was caught by external monitoring and corrected.
  • [Jan 18, 2000, 1:41 pm] gamma Downtime

    gamma rebooted under high load, and after some extensive file system checks came back with no lingering effects. Downtime was approximately 10 minutes.
  • [Jan 18, 2000, 11:48 am] uilen downtime

    uilen (www35) crashed under heavy load and was brought back online with downtime under 15 minutes.
  • [Jan 17, 2000, 8:56 pm] Omicron Downtime

    Omicron (www9) rebooted under severe load; it has been returned to service after a normal reboot.
  • [Jan 14, 2000, 8:33 pm] ruis Downtime

    ruis (www27) died under load. Downtime was less than 15 minutes.
  • [Jan 14, 2000, 5:29 am] pi Downtime

    Around 4:15am, pi crashed, apparently under heavy load. However, it refused to come back online. What was originally believed to be a hard drive problem was traced to faulty memory. After replacing the SDRAM on pi, it has booted without problems. The total downtime was approximately 60 minutes.
  • [Jan 12, 2000, 12:04 am] hwesta Downtime

    hwesta (www51) crashed under heavy load. Downtime was less than 10 minutes.
  • [Jan 11, 2000, 4:04 am] UUnet Maintenance

    UUnet has completed their maintenance and traffic is flowing normally again.
  • [Jan 11, 2000, 3:38 am] UUnet Maintenance

    UUnet's scheduled maintenance is taking longer than expected; it seems that they have run into difficulties. As of 3:35am Eastern time, our connectivity to UUnet has been offline for 30 minutes, and UUnet does not expect to restore that connectivity for at least another 30 minutes. Customer traffic is flowing normally through alternate paths, and is not being significantly affected.
  • [Jan 9, 2000, 2:39 pm] UUnet Connectivity

    At approximately 2:15pm Eastern time today, we lost connectivity with UUnet. The link came back up within a few minutes, but it took another fifteen minutes for UUnet to reestablish connectivity between Pittsburgh and the rest of their network. Hopefully this is not going to be a recurring problem in their POP; if so, they will likely take it down for hardware replacement.

    Customer traffic was delayed by the brownout; a clean outage would have been smoother. All is now returning to normal.

  • [Jan 8, 2000, 2:33 pm] vala Reboot

    vala (www59) was rebooted to alleviate a low-memory condition under heavy load. It has returned to regular service at this time.
  • [Jan 7, 2000, 6:57 pm] Maintenance Windows

    Our upstream providers have announced plans to utilize their maintenance windows as follows: UUnet Jan 11 Tuesday morning; Digex Jan 12 Wednesday morning; Digex Jan 18 Tuesday morning. None of these outages are expected to be long-term, and none should affect customer traffic significantly.
  • [Jan 6, 2000, 9:26 pm] gebo Downtime

    gebo (www139) froze. Downtime was less than 10 minutes.
  • [Jan 4, 2000, 10:02 am] vala Reboot

    vala (www59) was rebooted this morning to clear up a resource allocation problem. Downtime was less than 5 minutes, and the server has returned to normal service.
  • [Jan 3, 2000, 3:59 pm] vala Downtime

    vala (www59) was down for approximately 10 minutes this afternoon. It has since returned to normal service.
  • [Jan 1, 2000, 1:34 pm] zatz Drive Swap

    zatz has had its primary drive swapped for a larger model, with two brief downtimes of approximately five minutes each. The original drive was reporting errors.
  • [Jan 1, 2000, 1:02 am] Y2K Rollover

    There are no Y2K-related incidents to report with regard to pair Networks servers, network, or other operations. As of 1am Eastern time, all operations are normal.

    There is an unrelated prefailure indication on the hard drive for zatz; it will be replaced during normal hours on Saturday, January 1st, 2000.

    Happy New Year's!



