network timeout problems

Mirlyn

Well-Known Member
I posted this over at HWC, but I figure I should try my luck everywhere...

I'm using a stack of 3Com 3300-series managed switches on the network. There are several runs from the stack to one of our larger rooms, which is then split off using two 16-port OfficeConnects, also from 3Com. These 16-port switches were installed about a week ago, and we've noticed a tremendous problem with timeouts. Pinging a machine elsewhere on the stack (and not in the room) will sometimes end up with as much as 40% packet loss. Before installing these new 16-port switches, we had five 8-port DLinks in the room. Timeouts were never a problem, if they even existed.
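
For reference, the loss numbers above come from plain pings; something like this is enough to reproduce them (a rough sketch, assuming a Unix host with the standard ping binary on the PATH, and the addresses below are placeholders, not our actual lab machines):

#!/usr/bin/env python3
"""Rough packet-loss sampler: runs ping(8) against each host and
parses the summary line. The host list is hypothetical."""
import re
import subprocess

HOSTS = ["192.168.1.10", "192.168.1.20"]  # placeholders, not real machines
COUNT = 100  # pings per host

for host in HOSTS:
    out = subprocess.run(
        ["ping", "-c", str(COUNT), host],
        capture_output=True, text=True,
    ).stdout
    match = re.search(r"([\d.]+)% packet loss", out)
    print(f"{host}: {match.group(1) if match else '?'}% loss over {COUNT} pings")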

Now machines in this room are losing connectivity completely, which causes SSH sessions and domain logins to hang for a few minutes or freeze permanently. I originally thought it was the new 3Coms. I moved them to another, temporary network and got the same severe timeouts. For further testing, I tried the original DLink switches on this temporary network and got timeouts as well, though to a lesser degree.

I've run new cable around the room and still get timeouts. I changed the stack settings to 100 Full Duplex, 100 Half Duplex, and even tried enabling/disabling flow control for both FD and HD. No success. The results for each setting showed 30-40% packet loss on the 3Com 16-port switches and 10-20% loss on the DLink switches.
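
To rule out a silent mismatch on the host side, I can also dump what each lab NIC actually negotiated under each setting with something like this (a rough sketch; assumes Linux boxes with ethtool installed, and 'eth0' is a placeholder interface name):

#!/usr/bin/env python3
"""Print the negotiated link state of a NIC by parsing ethtool output.
Assumes Linux with ethtool installed (may need root on some systems);
'eth0' is a placeholder interface name."""
import re
import subprocess

IFACE = "eth0"  # placeholder -- substitute the machine's real interface

out = subprocess.run(["ethtool", IFACE], capture_output=True, text=True).stdout
for field in ("Speed", "Duplex", "Auto-negotiation", "Link detected"):
    m = re.search(rf"{field}:\s*(.+)", out)
    print(f"{field}: {m.group(1).strip() if m else 'unknown'}")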

I'm thinking it might be the cable from this room to the stack (original Cat3 from when the building was wired in 1992). I forced everything to run at 10 FD and HD and got better results. However, why would everything go to hell when the only thing that changed was the DLink-to-3Com switch swap? Surely if the cable couldn't support 100, we would have had this problem long before, when the 100M DLinks were installed. Plus, the DLinks ran at FD. Anyway, the original cable is still being used all over here at longer distances with very few problems. We even had gigabit running on it at one time, which makes me think it's not the cable from the room to the stack.

I'm stumped. Any ideas? I've read that flow control should be disabled when going to hubs, but these are all switches. When I disabled flow control, I still didn't get any better results.

I've updated the firmware on the stack to 2.70 and still have the same problem. I took one of the new 16-port switches down to the stack and plugged it into a run that goes to another 16-port in the lab. No problems. Pinging went fine, no timeouts. It's almost like the older 3Com stuff doesn't like the newer stuff.

When the stack is set to 100FD, the uplink port on the 16-port will turn on and off irregularly, like it's trying to negotiate the speed.

I'm stumped. We're going to call 3Com in the morning and see what they think, but I was wondering if I could be missing something. Any suggestions? The bang-head-on-wall method is starting to leave a mark.
 
It seems to me that 10HD is the most compatible. I run all mine at 100HD for compatibility reasons; I was getting traffic jams with full duplex. I have also had to completely power everything off and then power it back on in chain order when the computers are set to auto-assign IPs. If they have manual IPs, you may still want to try that.
 
Hmm, almost every problem I have had with switches is resolved by making sure the duplex and speeds match, which it looks like you did.

Is spanning tree turned on or off on each switch? Since you had DLink switches before, you probably aren't running any 802.1Q or ISL trunk lines, correct?

I have never tried to run 100Mbps over Cat3 cable; perhaps that is the offender. Did you re-terminate the cable? Perhaps there is a problem with the RJ-45 head. Just a few ideas off the top of my head.

rrfield
 
Tried every single duplex and flow control setting. No luck. Now, if I put it at 10HD, it works. But there's no reason for that. 10Mb to a 20-machine lab on a domain is asking for trouble. This particular lab is DHCP. I changed to static (sometimes the timeouts were affecting DHCP releases/renews), and that didn't help either; I still got the same ugly timeouts.

rrfield: No, these are untagged ports on a specific VLAN. Very few of our ports (well, only the gigabit trunk) are tagged (for the other gigabit/VLAN-capable switches in our server rooms). Like you said, I'd think if the DLinks could handle whatever was turned on, the OfficeConnects should too.

Didn't reterminate...yet. The runs are actually punched into a desk plate to each row of systems in the lab. However, I don't think it's the cable. I'll try to illustrate this (because it's mind-boggling here too):

This is the problem now:

server
|
~
|
stack
|
to lab
|
16-port
| |
laptop desktop



Tried this:

Server
|
~
|
stack
|
16-port --- laptop
|
to lab (using the same run as above)
|
16-port
|
desktop

No timeouts from the laptop to the server. No timeouts from the laptop to the desktop. No timeouts from the desktop to the server. We tried this setup twice, on two different runs; neither had timeouts.

Confused yet? :confuse3: I sure am :D I think it's something to do with the old stuff being unable to sync with the new stuff at higher speeds. I thought a firmware update would fix it, but it didn't.
 
rrfield said:
I have never tried to run 100Mbps over Cat3 cable; perhaps that is the offender.

I missed that part. I'm thinking that could very well be, especially if the connections are over 100' apart? I dunno, I've always used Cat5.
 
The stack is actually on the other side of the wall. I'd guess the cable is 20'. :) The stack is in the middle of the building and feeds the entire floor. We have 100FD machines in every office on the exterior wall, and none have problems, so while I hate Cat3 in networking, I feel it's safe to say the cable isn't the culprit here. Especially when the same cable works in a different setup.

If the connectors were bad, wouldn't the 16-port behave the same when on the other end of the run?
 
Yep, I'd think so... Have you checked to see if the IPs are being assigned right, or do you have them manually assigned?
 
It's all DHCP. I know that's not the problem. It worked fine before; the only thing that's changed is the switches.

Still waiting for 3Com to figure out what's going on. :)
 
Be careful with Cat3 operating at 100Mbps while using the TX specification (4 wires); Cat3 can do 100Mbps, but only using 6 wires.

That could be the cause of the packet loss. Can you use a sniffer and see if the Ethernet frames are arriving at the destination? Perhaps they have data errors and are being re-sent; constant Ethernet frame loss or data errors can cause TCP/ICMP or any other higher-layer protocol to lose packets.

Before you start demolishing the wall :D, run a Cat5 cable down to the other end and test.
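
If you don't have a dedicated sniffer handy, even a rough count of echo requests vs. replies on the wire will show where the pings disappear. A minimal sketch, assuming a Linux machine with scapy installed and root privileges; 'eth0' is a placeholder interface name:

#!/usr/bin/env python3
"""Count ICMP echo requests vs. replies seen on the wire while a ping
test runs, to spot where packets vanish. Assumes scapy is installed
and the script runs as root; 'eth0' is a placeholder interface."""
from scapy.all import ICMP, sniff

counts = {"echo-request": 0, "echo-reply": 0}

def tally(pkt):
    if pkt.haslayer(ICMP):
        if pkt[ICMP].type == 8:    # echo request
            counts["echo-request"] += 1
        elif pkt[ICMP].type == 0:  # echo reply
            counts["echo-reply"] += 1

sniff(iface="eth0", filter="icmp", prn=tally, timeout=60)  # capture for 60s
print(counts)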
 
Luis G said:
Be careful with Cat3 operating at 100Mbps while using the TX specification (4 wires); Cat3 can do 100Mbps, but only using 6 wires.

That could be the cause of the packet loss. Can you use a sniffer and see if the Ethernet frames are arriving at the destination? Perhaps they have data errors and are being re-sent; constant Ethernet frame loss or data errors can cause TCP/ICMP or any other higher-layer protocol to lose packets.

Before you start demolishing the wall :D, run a Cat5 cable down to the other end and test.

True, but if it was the cable, wouldn't it have misbehaved long before this? Those DLinks have been in there a good four years.

Even so, wouldn't this setup not work then?
Tried this:

Server
|
~
|
stack
|
16-port --- laptop
|
to lab (using the same run as above)
|
16-port
|
desktop

No timeouts from the laptop to the server. No timeouts from the laptop to the desktop. No timeouts from the desktop to the server. We tried this setup twice, on two different runs; neither had timeouts.

If the cable was bad, or for some reason stopped fully supporting 100FD when we swapped the switches, wouldn't that setup stop working as well?

I'll try sniffing at the switch to see if I see anything malformed.
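
While I'm at it, I'll watch the interface error counters on a couple of lab machines too; if the errs/drop columns climb during a ping test, that points at the physical layer. Something like this would do it (a rough sketch for Linux hosts, reading /proc/net/dev):

#!/usr/bin/env python3
"""Dump per-interface error/drop counters from /proc/net/dev (Linux only).
Climbing 'errs' or 'drop' values during a ping test would implicate the
physical layer (cable/terminations) rather than the switch logic."""

with open("/proc/net/dev") as f:
    lines = f.readlines()

# first two lines are headers; after the colon the columns are:
# rx: bytes packets errs drop fifo frame compressed multicast
# tx: bytes packets errs drop fifo colls carrier compressed
for line in lines[2:]:
    iface, data = line.split(":", 1)
    cols = data.split()
    print(f"{iface.strip():8s} rx_errs={cols[2]} rx_drop={cols[3]} "
          f"tx_errs={cols[10]} tx_drop={cols[11]}")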
 
I still think it might be the cable. It could be that the stack is more sensitive than the 16-port switch you put in between in the example above. I dare say that by putting it right before the Cat3 cable you are eliminating the problem, because the stack is now talking to the new switch without using the Cat3 cable, and the switches themselves have no problem with the Cat3 wiring.

In theory, all Ethernet hardware should comply with the same spec, but reality is different.

You have nothing to lose by trying with a Cat5 cable, other than at most 20 minutes.
 
I think Luis has a good point about the cable; perhaps the new switches don't jibe with the old switches over the existing cable. Everything I've ever heard says that Cat3 isn't suitable for 100Mb speeds. I'm just curious (and maybe you mentioned it): when you put the extra switch in between, what speed did it run at between the two 16-ports?

Edit: just checked, and certain 100BaseT technologies can run over Cat3. You may want to check the documentation and see what 3Com states for cable requirements.
 
This is really just edit 2 of my last post, but I dug this out of my Net+ text.

Sorry about the formatting; apparently I'm not good at teh windows anymore.

100BaseTX — This is the version you are most likely to encounter. It achieves its speed by sending the signal 10 times faster and condensing the time between digital pulses as well as the time a station must wait and listen for a signal. 100BaseTX requires CAT 5 or higher unshielded twisted-pair cabling. Within the cable, it uses the same two pairs of wire for transmitting and receiving data that 10BaseT uses. Therefore, like 10BaseT, 100BaseTX is also capable of full-duplex transmission. Full duplexing can potentially double the bandwidth of a 100BaseT network to 200 Mbps.

100BaseT4 — This version is differentiated from 100BaseTX in that it uses all four pairs of wires in a UTP cable and, therefore, can use lower-cost CAT 3 wiring. It achieves its speed by breaking the 100-Mbps data stream into three streams of 33 Mbps each. These three streams are sent over three pairs of wire in the cable. However, because 100BaseT4 technology uses all four wire pairs for unidirectional signaling, it cannot support full duplexing. One reason 100BaseT4 is less popular than 100BaseTX is because it cannot support full duplexing.
 
I'll check into getting a new Cat5 run. We might run into time constraints, as we have to go through university computing to get it done.

tommy: It always linked at 100 FD/HD, depending on how I set the stack to run.

3Com requires web registration, but won't let you register with anything but IE.
 
Tommy, the T4 standard uses all 8 wires of the UTP Cat3 cable to get 100Mbps, but you need compliant hardware to achieve that.

The TX uses only 2 pairs of wires to do the same, and most current 100Mbps hardware uses this standard.
 