Eric Journal

2006 Jan 30

Jan 28 05:08:29 eric unix: WARNING: [AFT1] EDP event on CPU0 Data access at TL=0, errID 0x0014cf26.3d155b76
Jan 28 05:08:29 eric     AFSR 0x00000000.00404000<EDP> AFAR 0x00000002.de45a4c0
Jan 28 05:08:29 eric     AFSR.PSYND 0x4000(Score 95) AFSR.ETS 0x00 Fault_PC 0x100211860
Jan 28 05:08:29 eric     UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Jan 28 05:08:29 eric unix: [AFT2] errID 0x0014cf26.3d155b76 PA=0x00000002.de45a4c0
Jan 28 05:08:29 eric     E$tag 0x00000000.09c05bc8 E$State: Modified E$parity 0x04
Jan 28 05:08:29 eric unix: [AFT2] E$Data (0x00): 0x3fa57bfc.eda65795 *Bad* PSYND=0x4000
Jan 28 05:08:29 eric unix: [AFT2] E$Data (0x08): 0x3fa77bfc.eda65795
Jan 28 05:08:29 eric unix: [AFT2] E$Data (0x10): 0x3fa77bfc.eda65795
Jan 28 05:08:29 eric unix: [AFT2] E$Data (0x18): 0x3fa77bfc.eda65795
Jan 28 05:08:29 eric unix: [AFT2] E$Data (0x20): 0x3fa77bfc.eda65795
Jan 28 05:08:29 eric unix: [AFT2] E$Data (0x28): 0x3fa77bfc.eda65795
Jan 28 05:08:29 eric unix: [AFT2] E$Data (0x30): 0x3fa77bfc.eda65795
Jan 28 05:08:29 eric unix: [AFT2] E$Data (0x38): 0x3fa77bfc.eda65795
Jan 28 05:08:29 eric unix: [AFT2] errID 0x0014cf26.3d155b76 AFAR was derived from E$Tag
Jan 28 05:08:29 eric unix: NOTICE: Scheduling clearing of error on page 0x00000002.de45a000
Jan 28 05:08:29 eric unix: [AFT3] errID 0x0014cf26.3d155b76 Above Error is in User Mode
Jan 28 05:08:29 eric     and is fatal: will reboot
Jan 28 05:08:29 eric unix: WARNING: [AFT1] initiating reboot due to above error in pid 23908 (l502.exe)
Jan 28 05:08:41 eric unix: NOTICE: Previously reported error on page 0x00000002.de45a000 cleared
Jan 28 05:08:43 eric unix: pseudo-device: pm0
Jan 28 05:08:43 eric unix: pm0 is /pseudo/pm@0
Jan 28 05:08:44 eric syslogd: going down on signal 15
Jan 28 05:09:30 eric unix: syncing file systems...
Jan 28 05:09:32 eric unix:  done
Jan 28 05:16:15 eric unix: ^MSunOS Release 5.7 Version Generic_106541-42 64-bit [UNIX(R) System V Release 4.0]
Jan 28
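
A quick check on the E$Data dump above, as a sketch only. It assumes all eight words in the cache line were meant to hold the same value (the log does not say so) and XORs the word flagged *Bad* with its neighbour to see how many bits differ. Only the upper 32 bits differ between the two values, so only those are compared; the function names are made up for illustration.

```shell
#!/bin/sh
# XOR two 32-bit hex words and print the result as zero-padded hex.
xor_words() {
    printf '0x%08x\n' $(( $1 ^ $2 ))
}

# Count the bits set in a non-negative integer.
count_bits() {
    n=$(( $1 ))
    bits=0
    while [ "$n" -gt 0 ]; do
        bits=$(( bits + (n & 1) ))
        n=$(( n >> 1 ))
    done
    echo "$bits"
}

# Upper halves of the *Bad* word (offset 0x00) and of the word at offset 0x08.
diff=$(xor_words 0x3fa57bfc 0x3fa77bfc)
echo "difference: $diff, bits flipped: $(count_bits "$diff")"
```

Under that assumption the two words differ in exactly one bit, which would be consistent with a single-bit flip in the E$ line.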

2005 May 06: Disk Failed

 -- call no 3737 6109


2003 July 15 : Eric No Talk --- Hardware Problem?

On the evening of the 14th Eric just suddenly stopped talking to me --- no login from anywhere was possible. On the 15th tried the console --- no luck. Got the ok prompt though and synced the disks, and got:

    dumping to c0t0d0s1
    Interrupt bitset after 10 seconds card/firmware failure

    [repeated four times]

    Fast Data Access MMU Miss
Then booted at the ok prompt and got:
    WARNING forceload of misc/md_trans failure
    WARNING forceload of misc/md_raid failure
    WARNING forceload of misc/md_hostspares failure
All seems ok though...


2003 July : NQS reinstalled/configured

...to be a local-queue-only system (no old Galaxy) --- see the separate document.


2003 May : /etc/inetd.conf

Made some changes to /etc/inetd.conf so that at next boot in.talkd, in.fingerd and in.uucpd will be blocked at the service/inetd level (in addition to at IP Filter level and router level).

For details see Security Journal.
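
A sketch of how such entries might be commented out, assuming the usual one-service-per-line layout of inetd.conf with the daemon name appearing somewhere on the line. The function name is illustrative and this is not a record of the commands actually used on Eric. It writes the edited file to standard output so the result can be inspected before anything is replaced.

```shell
#!/bin/sh
# Print CONF with a '#' put in front of every active line that mentions
# one of the named daemons. Lines already commented out are left alone.
comment_out_daemons() {
    conf=$1
    shift
    awk -v names="$*" '
        BEGIN { n = split(names, daemon, " ") }
        !/^#/ {
            for (i = 1; i <= n; i++)
                if (index($0, daemon[i])) { print "#" $0; next }
        }
        { print }
    ' "$conf"
}
```

For example, `comment_out_daemons /etc/inetd.conf in.talkd in.fingerd in.uucpd > /tmp/inetd.conf.new`, then diff the result against the original before copying it into place.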


2003 May 22: LSF

Removed LSF (pkgrm SUNWlsf) from Eric as the license has expired and we need the space (/opt is part of /).


2003 Feb 17: t2 and t3 disks

Formatted the second new "scratch" disk, c2t2d0s6 and mounted on /export/simonh for want of something more appropriate for now.

Formatted the "spare" disks, c0t3 and c2t3 and mounted on /export/little_star* (Twinkle, twinkle, little star, how I wonder what you are...).

Edited /etc/vfstab to ensure all gets remounted at next boot.


2003 Feb 17: SUNWhpc

Finally managed to get SUNWhpc installed and running on Eric with Ian. It would not install and work! Sun's support people were worse than useless, suggesting the problem was our LDAP authentication, which we proved was not the case after an install on mir.csu (Simon's Solaris7/openldap machine).

Files:

  /opt/SUNWhpc
  /etc/init/sunhpc.*
and associated links in /etc/.

For reasons best known to itself SUNWhpc would not install



2003 Feb 12: New /scratch Disks

Replaced the two 9Gb disks concatenated into /scratch (c0t2d0 and c2t2d0s0) with two 36Gb disks.

/scratch would not umount so determined which processes were using the slice using ~mpciish2/bin/lsof:

    lsof | grep scratch | awk -F" " print... | sort | uniq
and killed said processes then umount was successful. Swapped the disks and checked partition table with format and finally used newfs and mounted.
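
A sketch of one way to pull the PIDs out of lsof output; it is not the elided awk command above. It assumes lsof's default column layout, with the PID in the second column and the file name in the last, so names containing spaces would be missed. The function name is hypothetical.

```shell
#!/bin/sh
# Read lsof-style output on stdin and print the unique PIDs whose last
# column (NAME) is the given mount point or a path beneath it.
pids_under() {
    awk -v mnt="$1" '
        NR > 1 && ($NF == mnt || index($NF, mnt "/") == 1) { print $2 }
    ' | sort -un
}
```

Typical use would be `lsof | pids_under /scratch`, checking the list by eye before killing anything.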

Edited /etc/opt/SUNWmd/md.cf and md.tab to reflect changes.

Edited /etc/vfstab to reflect changes.

Files:

  /usr/opt/SUNWmd/*
  /etc/opt/SUNWmd/*


Tidy up of old VIP Eric Accounts

MechEng

Given an email from John Chinn to this effect, have determined which appear to be dead mecheng accounts, tarred and gzipped them, moved them to /export/umist/simonh* and deleted the entries from /etc/passwd, shadow and auto_home.

Civil

Based on the names given in the entry for 02/11/22 in this journal, am tarring and gzipping home-dirs of said civil engineers to /export/umist/simonh* and removing the home-dirs themselves. Have deleted entries from /etc/passwd etc.
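
A minimal sketch of the tar-and-gzip step, assuming each home-dir is archived to a single file named after the account. The function name and the file naming are illustrative, not a record of what was actually run. Nothing is deleted by the function; the home-dir would only be removed by hand after the archive has been checked.

```shell
#!/bin/sh
# Tar and gzip the directory HOME_DIR into DEST_DIR/<name>.tar.gz and
# verify the archive. Nothing is deleted here; removal is left to the
# caller once the function has returned success.
archive_home() {
    home_dir=$1
    dest_dir=$2
    name=$(basename "$home_dir")
    parent=$(dirname "$home_dir")
    archive="$dest_dir/$name.tar.gz"

    # Archive relative to the parent so paths inside start with <name>/.
    (cd "$parent" && tar cf - "$name") | gzip -c > "$archive" || return 1

    # Check the compressed stream and that tar can list its contents.
    gzip -t "$archive" || return 1
    gzip -dc "$archive" | tar tf - > /dev/null || return 1
    echo "$archive"
}
```

Usage would be along the lines of `archive_home /export/civil/<username> <backup dir>`, followed by a manual look at the archive before removing the home-dir.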


2003 January : tcpdump

Installed tcpdump from the sunfreeware binary to investigate problems. It complained that libcrypto.so.??? was missing. Google showed that this is part of openssh/openssl. Investigation showed that the OpenSSL package on sunfreeware.com, which installs itself in /usr/local/ssl, contains said library and that it was already installed on Cosmos (but not Eric). Adding /usr/local/ssl/lib to LD_LIBRARY_PATH solved the problem, and had a good play with tcpdump.


2003 January : syslogd

Edited /etc/syslog.conf and restarted (kill -HUP) syslogd on both Cosmos and Eric so logs are now copied to Gresh's logserver. Added this

    # gresh's log server :
    *.info						@130.88.???.???
to /etc/syslog.conf. N.B. the whitespace must consist of tabs, not spaces --- use the wrong one and an error message appears in /var/adm/messages, e.g.
    eric syslogd: line 44: unknown priority name "info        @130.88.120.194"
which clearly shows that syslogd is getting its knickers in a twist...
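
A rough check for the tab problem, as a sketch: it flags active lines in a syslog.conf-style file that contain no tab at all, on the assumption that such lines separate selector and action with spaces. It is deliberately crude (a line with a tab anywhere passes) and the function name is made up.

```shell
#!/bin/sh
# Print the line numbers of active syslog.conf entries that contain no tab,
# i.e. where selector and action are presumably separated by spaces only.
lines_without_tab() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
        index($0, "\t") == 0 { print NR }
    ' "$1"
}
```

Running `lines_without_tab /etc/syslog.conf` before sending syslogd the HUP should print nothing if all is well.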


2002 Dec 10: Removal of Dead (VIP) Chemistry Accounts

Accounts listed below were removed. Also, chemistry accounts in /export/umist were moved to /export/chem.

Envelope-to: mpciish2@cosmos.umist.ac.uk
Delivery-date: Mon, 09 Dec 2002 16:09:06 +0000
From: "Steven Y Liem" <y.liem@umist.ac.uk>
To: <simon.hood@umist.ac.uk>
Subject: RE: eric accounts
Date: Mon, 9 Dec 2002 16:08:59 -0000
Content-Type: text/plain;
	charset="us-ascii"
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
In-reply-to: <E18H1i4-0003NS-00@cosmos.umist.ac.uk>
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
Importance: Normal
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18LQSr-0007Fo-00*dAjrC1N2GiQ*
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18LQSv-0005kG-00*EcGy3ra7Jxw*

Hi Simon,

The following accounts can be savely removed:

    Mcdas01, mcdas02, mcdst03, mcdst04, mcdap01, mcdst00, mcdsskl, mcdssmpi,
    mcdssjc, mcdsslvw, mcdigfa2.


Cheers

Steven


2002 Dec 02: Got IP Filter Properly Configured

See this for details.


2002 Dec 02: rexec

Shut down access to Eric via in.rexecd by commenting out the in.rexecd entries in /etc/hosts.allow (have default-deny in /etc/hosts.deny). Sent a reminder email to the same people as below.


2002 Nov 26: Email re rexec


Envelope-to: mpciish2@cosmos.umist.ac.uk
Delivery-date: Tue, 26 Nov 2002 14:26:27 +0000
From: Dr Simon Hood <mpciish2@galaxy.umist.ac.uk>
To: patrick.o'malley@umist.ac.uk,
 j.clarke@umist.ac.uk,
 yun.liem@umist.ac.uk,
 paul.popelier@umist.ac.uk,
 Les.Woodcock@umist.ac.uk,
 C.Yulian@umist.ac.uk,
 M.Aleyaasin@umist.ac.uk,
 s.burley@umist.ac.uk,
 pj.tan@umist.ac.uk,
 zhenmin.zou@umist.ac.uk,
 john.j.harrigan@umist.ac.uk,
 michael.burdekin@umist.ac.uk,
 y.tkach@umist.ac.uk,
 john.chinn@umist.ac.uk,
 Simon.Hood@umist.ac.uk,
 a.g.nasser@umist.ac.uk,
 ian.j.rosindale@umist.ac.uk,
 z.bakker@umist.ac.uk,
Subject: rexec, telent and ssh to eric
Reply-to: simon.hood@umist.ac.uk
Date: Tue, 26 Nov 2002 14:26:16 +0000
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18GgfI-0005cj-00*o95zl2UGVZ6*
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18GgfT-0000cw-00*jDz8szPBPIM*


Hi All,

over the coming weeks and months security on Eric will be tightened 
significantly.  Main changes:

 1. The r-commands will we disabled.  This means that eXceed users must
    change their configuration to use a method other than rexec (or rsh,
    rlogin).  For the present telnet is the simplest option to use.  

 2. telnet and ftp access will soon be disabled from outside .umist.ac.uk 
    --- ssh and scp are available instead and are more secure.  If you have 
    questions regarding these please email me.  (Note that you will need 
    a recent ssh client, and this must be configured to use 
    keyboard-interactive authentication.)

Please reconfigure your eXceed installation so that it does not use an
r-command!

Regards

Dr Simon Hood, ISD.


2002 Nov 26: Security Plan for Cosmos and Eric

First steps of plan developed to get IP Filter on Cosmos and Eric with default-deny; also replace telnet and ftp with ssh and scp. For details of the evolving plans see this.


2002 November 22


Envelope-to: mpciish2@cosmos.umist.ac.uk
Delivery-date: Fri, 22 Nov 2002 11:35:07 +0000
From: "Steven Y Liem" <y.liem@umist.ac.uk>
To: <simon.hood@umist.ac.uk>
Subject: RE: eric dead accounts
Date: Fri, 22 Nov 2002 11:34:57 -0000
Content-Type: text/plain;
	charset="us-ascii"
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
In-Reply-To: <E18FBxZ-0002Y1-00@cosmos.umist.ac.uk>
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
Importance: Normal
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18FC5I-0004bm-00*Ea/fFyUQPIg*
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18FC5T-000319-00*YKbOIHJQBHI*

Hi Simon,

I think the following accounts can be safely removed:

Mcdas01, mcdas02, mcdst03, mcdst04, mcdap01, mcdst00, mcdsskl, mcdssmpi,
mcdssjc, mcdsslvw, mcdigfa2.

The last 5 accounts should be backed up before removing.

Regards,

STeven


02/11/22

Based on info from Prof Burdekin began the process of backing up and deleting the following accounts from Eric:

    Prof Burdekin said...  

    "Of the persons listed in your e-mail in my group the following have left 
     UMIST and can be deleted from future use - if they have any files left on 
     the system please let me know as these must be backed up."

    So...

                                   kbytes

    mcgidkpk:  /export/umist --> civil,       569816,  K.P. Kou, 
    mcgihkk4:  /export/umistmisc --> civil,  3024849,  K. Kuntiyawichai, 
    mcgijis2:  /export/civil1 --> civil,      712342,  I. Sbokos, 
    mcgizsu2:  /export/civil,                      6,  S. Ucsnik, 
    mcgiztd2:  /export/civil,                     32,  T. Dohr, 
    mcgsswz2:  /export/civil,                9065021,  W. Zhao.
Moved accounts all to /export/civil and disabled each via hacking entry in /etc/passwd.


02/11/20 and 02/11/21

Copied /var/yp from Cosmos to Eric and adjusted as seemed appropriate. Added some sensible contents to /var/yp/ypfiles/passwd and auto_home and security/passwd.

Started NIS/YP on Eric by calling

    /lib/netsvc/yp/ypstart
and we got
    starting NIS (YP server) services: ypserv ypbind ypxfrd \
        rpc.yppasswdd rpc.ypupdated done.
which resulted in
     /usr/ucb/ps auxww | grep yp

    root  ...  /usr/lib/netsvc/yp/ypserv -d
    root  ...  /usr/lib/netsvc/yp/ypbind
    root  ...  /usr/lib/netsvc/yp/ypxfrd
    root  ...  /usr/lib/netsvc/yp/rpc.yppasswdd -D /var/yp/ypfiles -m
    root  ...  /usr/lib/netsvc/yp/rpc.ypupdated
which is the same as Cosmos so looks well.

/var/yp/Makefile failed with "ypservers: no such map" or words to that effect. After a quick look on Google, added a ypservers map thus:

    cd /var/yp
    echo eric eric | makedbm - tmpmap
    mv tmpmap.dir yp.eric.umist/ypservers.dir
    mv tmpmap.pag yp.eric.umist/ypservers.pag
then
    [root@eric yp]# ypcat ypservers
    eric
    [root@eric yp]# touch ypfiles/auto_home ypfiles/passwd \
        ypfiles/security/passwd.adjunct
    [root@eric yp]# /usr/ccs/bin/make
    updated passwd
    ...
Good! Removed all evidence of mpciish2 from /etc/passwd, shadow and auto_home and found could not login as mpciish2. Added nis to passwd entry in /etc/nsswitch.conf and all is well --- can log in as mpciish2.
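
A small sketch for checking which sources a database uses in an nsswitch.conf-style file, e.g. to confirm that nis has been added to the passwd entry. The function names are hypothetical; bracketed criteria such as [NOTFOUND=return] are simply printed as tokens alongside the sources.

```shell
#!/bin/sh
# Print the sources listed for a database (e.g. passwd) in an
# nsswitch.conf-style file, one per line, ignoring comments.
nss_sources() {
    awk -v db="$2:" '
        { sub(/#.*/, "") }                       # strip trailing comments
        $1 == db { for (i = 2; i <= NF; i++) print $i }
    ' "$1"
}

# Succeed if SOURCE is among the sources configured for DATABASE.
nss_uses() {
    nss_sources "$1" "$2" | grep -x "$3" > /dev/null
}
```

For example, `nss_uses /etc/nsswitch.conf passwd nis && echo ok`.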


02/11/15


Envelope-to: mpciish2@cosmos.umist.ac.uk
Delivery-date: Fri, 15 Nov 2002 12:48:26 +0000
From: "Steven Y Liem" <y.liem@umist.ac.uk>
To: <simon.hood@umist.ac.uk>
Subject: RE: eric accounts
Date: Fri, 15 Nov 2002 12:48:11 -0000
Content-Type: text/plain;
	charset="us-ascii"
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
In-Reply-To: <E18BwzY-0001Iy-00@cosmos.umist.ac.uk>
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18CftQ-0000OA-00*avBW9OFXohA*
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18Cfta-0007VF-00*6RjdqRzU3iI*

Hi Simon,

This is to confirm what we discussed on the phone, that you can remove
mcdsswkd from eric's disk.

Regards,

Steven


02/11/15

Renamed /var/nis/NIS_COLD_START so that /etc/init.d/rpc does not start NIS+ on an Eric (re)boot.


02/11/15

Problem with mounting shared (exported) Eric filesystems on clients (which do have permission to do this). The original entry in /etc/nsswitch.conf for RPC was nisplus [NOTFOUND=return] files; it is now just files. Having taken out NIS+ yesterday, it seemed likely that RPC and/or nfsd and related stuff needed restarting. Did so; problem solved. Details follow...

Initial problem:

   mount eric:/export/mecheng/<username> /home/<username>
   nfs mount: eric:/export/mecheng/mcjifmta: server not responding : 
       RPC: Timed out
   nfs mount: retrying: /home/<username>
Restarted /usr/sbin/rpcbind on server (eric) and then got
    mount eric:/export/mecheng/mcjifmta /home/mcjifmta
    nfs mount: eric: : RPC: Program not registered
    nfs mount: retrying: /home/mcjifmta
on a Solaris client and
    mount eric.umist.ac.uk:/export/umist/isd/mpciish2 /mnt/eric
    mount: RPC: Unable to receive; errno = Connection refused
on a Linux client. Google suggested killing nfsd, restarting mountd and then starting nfsd again, so given
    /usr/ucb/ps auxww | grep -i nfs
    root       252  ...  /usr/lib/nfs/lockd
    daemon     253  ...  /usr/lib/nfs/statd
    root       504  ...  /usr/lib/nfs/mountd
    root       506  ...  /usr/lib/nfs/nfsd -a 16
did
    kill 506
    kill -HUP 504
...oh shit, mountd disappeared, so
    /usr/lib/nfs/mountd 
    /usr/lib/nfs/nfsd -a 16
    /usr/ucb/ps auxww | grep -i nfs
leaving
    root   ...  /usr/lib/nfs/mountd
    root   ...  /usr/lib/nfs/nfsd -a 16
    root   ...  /usr/lib/nfs/lockd
    daemon ...  /usr/lib/nfs/statd
All sorted it now appears.
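
A sketch for confirming, after a restart like the one above, that the NFS-related programs are registered with rpcbind again. It reads `rpcinfo -p`-style output on standard input and assumes the service name is in the last column; the function name is illustrative.

```shell
#!/bin/sh
# Read `rpcinfo -p`-style output on stdin and report, for each service named
# as an argument, whether it appears in the last (service name) column.
# Returns non-zero if any service is missing.
check_registered() {
    listing=$(cat)
    status=0
    for svc in "$@"; do
        if printf '%s\n' "$listing" |
            awk -v s="$svc" '$NF == s { found = 1 } END { exit !found }'
        then
            echo "$svc: registered"
        else
            echo "$svc: NOT registered"
            status=1
        fi
    done
    return $status
}
```

From a client this might be run as `rpcinfo -p eric | check_registered mountd nfs`; a "NOT registered" line would match the "RPC: Program not registered" error seen above.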


02/11/14 : Switched off NIS+

Copied contents of passwd.org_dir (NIS+ map) to /etc/passwd and put suitable place-holders in /etc/shadow. Ensured contents of /etc/nsswitch.conf had nisplus entries removed (commented out). Then studied contents of /etc/init.d/rpc and found that needed to stop rpc.nisd, nis_cachemgr and rpc.nispasswdd. Did so. Seemed ok.


02/11 : Info Re Dead Accounts


Envelope-to: mpciish2@cosmos.umist.ac.uk
Delivery-date: Fri, 08 Nov 2002 17:07:33 +0000
From: "Steven Y Liem" <y.liem@umist.ac.uk>
To: <simon.hood@umist.ac.uk>
Subject: RE: eric disk space
Date: Fri, 8 Nov 2002 17:07:30 -0000
Content-Type: text/plain;
	charset="us-ascii"
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
Importance: Normal
In-Reply-To: <E189jeD-0007Kf-00@cosmos.umist.ac.uk>
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18ACbP-0004gC-00*xjYikpUGu6A*
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18ACbU-0008T2-00*JYZ6OPim4U.*

Hi Simon,

I believe the following users should be retained on eric:

mcdapyl
mcdap00
Mcdapjc
Mcdijuc2
Mcdi7ps4
Mcdi7bn2
Mcdaspm
mcdstpp
mcdst00

I am still waiting for some of the supervisors to confirm whether it is
OK to remove other usernames.

Also, bsub/bjobs doesn't seem to work anymore. I have a project student
who requires it to run mpi jobs. 

By the way, when is eric going to be reconfigured and will it have a
dedicated queue for parallel jobs ?

Regards,

Steven


Envelope-to: mpciish2@cosmos.umist.ac.uk
Delivery-date: Tue, 12 Nov 2002 13:30:05 +0000
From: "Professor F.M.Burdekin" <michael.burdekin@umist.ac.uk>
To: simon.hood@umist.ac.uk
Date: Tue, 12 Nov 2002 13:29:56 -0000
Content-type: text/plain; charset=US-ASCII
Subject: Re: eric reconfiguration
Reply-to: michael.burdekin@umist.ac.uk
CC: simon.hood@umist.ac.uk
X-Confirm-Reading-To: michael.burdekin@umist.ac.uk
X-pmrqc: 1
Priority: normal
In-reply-to: <E18BZ3C-0003vB-00@cosmos.umist.ac.uk>
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18Bb79-0001d7-00*4oJ8jBROQk.*
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *18Bb7F-0005pZ-00*NTRV4TzZ5OM*

Dear Dr Hood,

Of the persons listed in your e-mail in my group the following have left 
UMIST and can be deleted from future use - if they have any files left on 
the system please let me know as these must be backed up.
K.P. Kou, K. Kuntiyawichai, I. Sbokos, S. Ucsnik, T. Dohr, W. Zhao.

The following persons are still at UMIST and must be given continued 
access:
F.M. Burdekin, K.C. Leong, Y. Tkach, E. Aja de Retana,  N. Cunliffe, E. El 
Dardiry.

Please note that I purchased an additional disk specifically for the use of 
my group, and E aja de Retana in particular, and that disk is not to be made 
available to persons outside my group.
Regards,


Michael Burdekin

Professor F.M. Burdekin
Tel No. 0161-200 4600
Fax No. 0161-200 4601


02/11/13: Reconfigured ssh authentication and sshd on Eric

For authentication added these lines to /etc/pam.conf:

    sshd2   auth sufficient /usr/lib/security/pam_unix.so.1
    sshd2   auth required   /usr/lib/security/pam_ldap.so.1 try_first_pass

For sshd the important lines in /etc/ssh2/sshd2_config are

    AllowedAuthentications  keyboard-interactive
and
        AuthKbdInt.Required pam

The details are as for Cosmos.


02/11/12: Duplex

Changed eric from half-duplex to full-duplex networking --- or, rather, Pete Smith did by reconfiguring the switch port into which Eric is plugged. Eric simply reported as a message that the network had changed. Simple.


02/11/12: migrated non-eUMIST user to eUMIST user

Migrated Yun Steven Liem, mcdapyl to mcdssyl on Eric. Backed up files to /scratch, used Solstice to create new user and moved old files to new home-dir (well, renamed it). Easy-peasy.


02/11/05: Patched Eric

Patched Eric with the Recommended and Security patch cluster from sunsolve.sun.com.

Problems:

First needed to rm some stuff from the / partition to create space.

Then:

     -- got repeated failure of patches as summarised by
            /var/sadm/install_data/Solaris_7_Recommended_log
        and detailed in 
            /var/sadm/patch/*/log
        such as
            pkgadd: ERROR: checkinstall script did not complete successfully

     -- a search on google suggested that

            when installing a patch, the Solaris 2.5+ patch installation
            procedure will execute the script "checkinstall" with uid nobody.

            If any of the patch files cannot be read by nobody or if any part
            of the path leading up to the patch directory an error similar to
            the following will appear:

     -- it turned out that the directory from which I was installing the
        patches was not readable by nobody (e.g., su nobody, then pwd to check
        this) so I changed permissions of the dirs in the path and all was 
        well...patches installed.
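
A sketch for spotting this permission problem before running the install: it walks from the patch directory up to / and lists every directory that is not readable and searchable by "other". It assumes user nobody gets only the "other" permissions on those directories and looks only at the mode string printed by ls, so ownership, groups and ACLs are ignored. The function name is made up.

```shell
#!/bin/sh
# Walk from DIR up to / and print every directory that lacks read or
# search permission for "other". Only the plain mode bits shown by ls are
# examined; ownership, group membership and ACLs are ignored.
dirs_blocking_other() {
    dir=$(cd "$1" && pwd) || return 1
    while :; do
        mode=$(ls -ld "$dir" | cut -c8-10)
        case $mode in
            r?[xt]) ;;                  # readable and searchable by other
            *) echo "$dir" ;;
        esac
        [ "$dir" = "/" ] && break
        dir=$(dirname "$dir")
    done
}
```

An empty result from `dirs_blocking_other <patch dir>` suggests the checkinstall script, running as nobody, should at least be able to reach the directory.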


02/11/05: Moved services from nisplus to flat files in /etc/nsswitch.conf

As at 2002 Nov 05 1347, /etc/nsswitch.conf looked like this.


02/11/04: eUMISTyfied Eric

eUMISTyfied Eric with nisplus still in place --- followed what I did for Cosmos exactly and can now authenticate to mpciish2 with both local (nisplus) and LDAP/eUMIST passwords.

Outline:

 -- use Cosmos as a template for eUMISTifying Eric:
     -- copy over libs and make (and remove) s-links as necessary:
         -- lib
     -- copy over and/or edit config files as necessary:
         -- /etc/ldap.conf
         -- /etc/nsswitch.conf
         -- /etc/pam.conf

 -- as at 2002 Nov 04, 0920, still have nisplus (not nis) but have successfully
    eUMISTified eric:  can log in as mpciish2 with both local (nisplus) 
    password and eUMIST password!




About this document:

Produced from the SGML: /home/isd/public_html/_eric/_reml_grp/journal_eric.reml
On: 30/1/2006 at 10:12:46
Options: reml2 -i noindex -l long -o html -p multiple