Thursday 9 June 2016

Juniper EX4600 switches - first impressions

We recently* took delivery of two Juniper EX4600 switches to ultimately replace our rather decrepit old Nortel Passport 1624G collapsed core/distribution switches. When they don't even bother to consistently relay DHCP messages, it's time to retire them (I had a hack involving a Mikrotik routerboard - the "swiss army knife" of the under-resourced network architect - in place to patch the DHCP issue, but I didn't like it being so). That should give you an idea of what refresh cycles around here are - the Nortels were even second hand when the school acquired them!

Juniper EX4600 with two optional modules installed
This means that we ought to expect such high end L3 switches or core routers to remain in service for 8-10 years, so we need to ensure that when we buy something like the core of the network, it's going to keep pace as much as possible with likely developments over the next decade. "Predicting the future" in tech is hard (and often futile), but networking doesn't often go through sudden massive changes (networking tech inertia is high - see how we're mostly still using IPv4?). That means you're going to want to install units that support your long term vision of the network for a long time, and will likely ensure you'll be able to deploy the technologies you're likely to need (particularly faster backbones; IPv6; adequate resilience as the school network moves from "sometimes, email is useful" to "OMG, I can't teach because the internet is down"). It also means we're probably ultimately going to lose them to old age "bathtub curve" failure, rather than a formal refresh cycle...


Pending a lot more money and a lot more fibre, I'm keeping the "collapsed core/distribution/aggregation" design we have for the time being, so access switches will connect into one or other of the Junipers - preferably both, and within fairly short order it should be "mandatory" for key buildings (i.e. those with BYOD-centric learning functions for large numbers of people during class time) on campus to connect to both.

Whilst I've not personally used Juniper gear before, in conversations with professional colleagues they have received rave reviews. In particular, their "virtual chassis" and distributed link aggregation (MC-LAG) technology was highly praised for making the life of a network administrator rather pleasant; JunOS was also considered to be rather nice.

When we embarked on the project, we contacted several possible suppliers with a "request for proposal"; it was as vendor agnostic as possible, keying in on the RFC-backed/industry standard features we needed supported, rather than any specific vendor's proprietary systems. We received several different proposals, with kit from Dell, Juniper and Cisco proposed by different potential suppliers. In the end, the Junipers offered (by far) the best price/performance trade-off (the proposed Ciscos were the worst). At the end of the year, we scraped together just about enough money (with some spare change pulled out from under some couch cushions) to put in an order for two of them and some optics.

Incidentally, Juniper have an interesting way of "protecting" the first reseller you contact - they will offer them the best deal on the kit cost price (as it seems each end customer is registered into a central "sales leads" database of some sort by the reseller). This means if a reseller is not monitoring their email properly, even if they're the biggest one on the continent, they may be beaten on price (by a wide margin) by a smaller one that might in the end not be able to offer the same levels of service (which you may not need anyway). Do I need to have an in-country stock replacement unit of a particular switch? It rather depends on my network design, and whether I can afford the yearly maintenance contract for that service - it's common in corporate enterprise IT to enter into such contracts, but quite rare in schools, which tend to do things on shoestring budgets (because that is, realistically, what they have - whilst slowly evolving to have "Fortune 500" uptime wants/needs).

Both of the resellers have been very helpful - even the one that knew they weren't going to get the business because of the system above took the time to make sure that we'd get what we needed through the other reseller, pointing out that they wouldn't be able to offer the sort of "quick replacement" with in-country inventory stock service levels - if we needed them. Which we don't, given the size of our budget and other more pressing needs - sure, it's inconvenient as hell if a core switch breaks, but I have the old cores standing around if I need to press something into service "in an emergency" - as fairly dumb L2 "fibre switches".
Nortels (particularly the now ancient Baystack line) are like the AK47 of the networking world. Clunky, not particularly attractive, but you can bury them in a muddy field in Siberia, dig them up 5 years later, and shoot someone. With varying degrees of accuracy.

Every day in IT is a balancing act between keeping your "invisible" backend systems reasonably stable and up to date, and the more visible "front end" infrastructure people more often moan about (frequently, they are even justified in doing so). My "gut feel" is that too many resources have been pumped into the front end systems as "once off" expenditures, and the backend (and ongoing maintenance budget) hasn't really had enough love; essentially, it's indicative of Demand Level 200, Resource Level 20. Something isn't going to be happy (and not just the IT manager). I think I'm slowly winning the required resources battle, but many of the gains have been offset by the financial situation in the country (specifically the execrable exchange rate with the US$, which deeply affects all tech pricing and how far your budget stretches; my perceived "what it should be" value is around R6-8 to the dollar; at the moment, the rand is worth less than half that (around R15 to the dollar); this year has been horrid). Given the modest increases and the yawning gulf between needs and facilities, there will be many sad faces for many years, unless we suddenly find 500% more money (dream on, buddy). I think the school is recognising that if tech is going to be so central to teaching and learning, it needs the resources it needs, and they must be found. Sometimes it helps to have "the money person" as your line manager, too - at least once you learn to speak their language. The "Red Queen" running flat out just to stay in the same place nature of tech doesn't win you many friends! It's as good an analogy in tech as it is in evolutionary biology.
I also received a follow-up call from an official Juniper representative to make sure I'd been "properly taken care of" by the resellers, who received rave reviews as a result of being generally helpful - even when they didn't know whether or not they'd get any business.

Having been in the position of inheriting a network where too many pennies were scrimped (almost certainly "to get at least something going"), leaving you with kit that is almost obsolete when new (why buy a 10/100 PoE switch when the gigabit ones are only one or two percent more in some cases? False economies are real), I did not want to give myself - or any successor - buyer's remorse in 5 years' time. This meant, even though we don't yet have a roadmap for IPv6 deployment, all gear needs to support it; although we didn't (until very recently) have any gear that supports a 10Gb/s connection/uplink, it is the current enterprise network backbone standard and therefore what we should get (or better - particularly as we could fairly easily have a 10Gb/s Internet link); and we also had a host of requirements for reliability and resilience at that network layer (obvious things like various Spanning Tree modes; routing protocols (I happen to like OSPF); VRRP and the like; distributed link aggregation and all that good stuff). The Junipers met (or exceeded) all of these requirements, and after much hand-wringing ticked the most boxes vs. the cost.

Not all schools are going to require a 10 gigabit backbone today, or even in 5-10 years time - it all depends on the services you need to offer to enable the teaching and learning that your school - and its teachers - want to deliver, and how much of that requires access to "first world, corporate" internet and LAN speeds. I would contend that this is likely to be the majority within a few years. Of course, we may see the establishment/marketing of schools that support "anti-tech" parent choice one day soon - if the business people are smart, they might use this to compete in the marketplace when their school is too far away from major population centres to make "gigabit plus" internet a reality. Or such schools need to be very canny about caching and localising content and NOT going the cloud route.

If you'll allow me a digression from the subject of networking, spurred by the concerns I hope the previous paragraph might raise in the reader, I would strongly contend that one of our roles as IT professionals in education should be - to the extent that school structures allow it - to "subtly" steer (or simply "facilitate") teaching and learning which integrates tech to be more effective than teaching without it. We must also carefully consider that most of us are probably not teachers (certainly not "qualified" teachers) and have little experience of teaching, so we have to be extremely careful what we might do in this process. I'm blessed to have a "tech evangelist" as the Director of Academics in the Senior School, and one who is deeply engaged in the conversations around what tech should (and shouldn't) do in education - indeed, in the profound "what is education?" discussion. I'd suggest that at least some of your own self-paced learning/continuing professional development should be engaging not only with the tech literature and discussion, but also with that of the user community you serve - education.
As an example of this "subtle steering", in the Junior School, we eventually identified some "champions" to drive projects there (like a pilot iPad deployment) - contrast a position of "we just need to have iPads" from management with that of a teacher coming with a host of lesson plans and ideas, having identified their own perceived "weaknesses" in tech, and having already identified ways to work around them (like finding an experienced expert to call on) and asking if we could possibly find some iPads. One of those two will "work"; the other one will end up with some expensive toys sitting on a shelf gathering dust and rapidly aging into obsolescence - for close to the price of a badly needed new server or network switch or new fibre route. I must admit that, aside from the budgetary challenges of buying the required number of iPads when they were initially requested, I categorically didn't want them to be a white elephant, and was waiting for that level of engagement - eventually, my "hints" worked, and we got to a workable model - which even included a pilot lesson with "scavenged" iPads - which by now is probably working quite well (it's been a busy year so far, so I've not had time to play "fly on the wall" of late; accidents of geography tie me - and informal corridor conversations - more to the senior school than the junior school).
At the end of the day, whether selfish or not, we IT professionals should hopefully ensure that parents who believe "anti ed-tech" is a good stand-point should be (justifiably) viewed by most of society in the same light as the "anti-vax" movement or those that seek to remove evolution from biological curricula, in part by making sure the benefits far outweigh the "risks", and that adequate safety controls and regular assessments are in place. Ideally such views should be based on certainty stemming from the same amount and type of rigorous evidence as vaccination has behind it. Indeed, it should be so "obvious" that parents never even consider that education without tech might be better. I think we're still very much at the stage where it can go either way - the ed tech systems aren't "good enough" to not need teachers to drive the process (in the same way that there are no other "teaching aids" that replace excellent teaching!), so "ed tech" has to focus on teachers in these nascent stages (not necessarily so much on students, but that is of course part of the path). Certainly, we're still at the stage that how well it works depends on the specific classroom you examine. We do that by partnering with teachers to ensure that teaching and learning that incorporates tech greatly enhances that project - and better prepares students to meet and conquer many of the challenges in their lives. We help by trying to push for more concrete statistics and measurements (without unduly adding to "assessment load") to support ed tech being a positive and good move. And if ed tech gets in the way, it's making teaching and learning worse. 
Make absolutely sure that doesn't happen - training (of staff and students) is key; if we spend years learning to use a pen or pencil, we should probably spend at least as much time "getting to grips" with technology, which is the "pen" of the modern society - and the stage in school where we start to "learn tech" should probably echo that of handwriting. Reliable systems (including perhaps more formal IT processes than are "traditional" in many schools), and architectures that support stability and "invisibility" of tech, are vital. These are very complex issues and there will be no quick and easy answers to them, but we must seek them out. And now, back to Juniper!

At the end of the day, a single Juniper of this type costs roughly the same as the last Dell server we purchased, which is not, in my opinion, a "bad deal", given the network-centricity of modern IT and the capabilities of the platform. It's a big chunk of my annual budget. Two of them is a really big chunk of change, but finally, I've met one of my initial priorities to get things on track towards a modern and stable LAN, a year and a half after starting at the school. Yay!

At the moment, we'll be deploying a dual 10Gb/s virtual chassis link between our two datacentres; in the future, once we have more fibre, we will probably investigate doing this across the 40Gb/s interfaces. Doing this on our existing OM2 multimode fibre necessitated some rather expensive LRM SFPs and mode conditioning patch cords. This also means from now on, we're going to be installing 10Gb/s interfaces in our servers and networked storage, wherever supported. And, unless we have a very good reason, it's goodbye OM2.

So now, it's time to learn JunOS...

For the most part, it appears to be fairly straightforward; the Juniper documentation is excellent and the "Day One" series of guides provide "recipes" that mostly get you up and running quite quickly. Creating a virtual chassis is quick and easy.

However...

The one thing I've struggled to find documentation on is getting the ports on your backup master / linecard switches to work! Interestingly, when you create a virtual chassis, the system doesn't automatically configure all the ports (i.e. your configuration doesn't magically get expanded by n ports; it is essentially completely blank for everything except the ports on the master switch) - you have to manually configure them yourself. It is surprising that the documentation is not more explicit on this, particularly in the Day One guides (or perhaps, I skim-read too much too quickly). This is not an unusual situation in high-end gear, where assumptions made by relatively dumb code tend to mess things up (so we often actively don't want things to be "clever") - they typically expect a "clueful" network administrator to go in there and configure them exactly as wanted. (In a way, it's not dissimilar to Cisco devices coming out of the box with all the ports in shutdown state).

It took me some head-scratching and a good night's sleep to have the sudden inspiration to not try adding the interfaces on the second unit to "unit 1" (they're still considered "unit 0", even though physically they're on the second unit [which would be 1 because numbering starts at 0]). You'll get commit errors when you try configuring them as unit 1. If you think about it, it sort of makes sense if you consider the VC as one big chassis switch that just happens to be in two (or more) different places.

In other words, I've now taken to thinking of "unit 0" as "the virtual chassis" (although that's not exactly what JunOS thinks it is: strictly speaking, the unit is the logical unit, or sub-interface, number on each physical interface, and 0 is the one ordinary switch ports use); it's essentially a placeholder to make their interface naming and grouping fairly consistent across platforms. Pretty much every interface statement you're going to make will need to include this in some way, so get used to typing unit 0 a lot.

Like all high end platforms I've seen, Juniper names their interfaces in a "modular" fashion, so instead of being something like Ge24, you'll end up with xe-0/0/23 (a 10Gb interface has the prefix xe; Juniper starts counting at 0 [so do high end Cisco routers; I found it kind of amusing that our connection to our upstream ISP at my old job was on a port that was (I vaguely recall) 0/0/0/0 on their Cisco; that's also where I learnt about how sometimes speed/duplex auto-negotiation... doesn't]). This also means that the ports on your second unit will be xe-1/0/n and so on and so forth. The first "number" is the member number in the Virtual Chassis (VC) arrangement (strictly speaking, it's the "fpc", which on this platform is effectively synonymous with the VC member number); the second number does not change, unless you install expansion cards (it's usually a sub-interface board on high end platforms; this remains 0 in the non-modular gear; strictly speaking it's the "pic"); and the last number is the physical port number (which is labelled on the switch if you're not sure how the numbering works). If you've ever seen a modular chassis, it makes a lot more sense, but may be a little confusing if you've only ever worked with small, standalone switches. Remember that Juniper are trying to have a consistent OS and configuration across pretty much their entire line; whilst that might be overkill on a little access switch, it means that once you understand the syntax and naming conventions, it works on EVERYTHING they make (with the odd tweak in some commands).
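
To make the naming convention concrete, here is a minimal Python sketch that splits a name like xe-1/0/23 into its parts. It is only an illustration of the type-fpc/pic/port pattern described above, not anything Juniper ships: the function and class names are made up for this example, and the "ge" and "et" entries in the prefix table are assumptions from general Junos usage rather than something covered in this post.

```python
import re
from typing import NamedTuple

# Media-type prefixes. "xe" (10Gb) is the one discussed in the post; "ge" and
# "et" are assumptions added for illustration, and the table is not exhaustive.
PREFIXES = {
    "ge": "1 Gb/s Ethernet",
    "xe": "10 Gb/s Ethernet",
    "et": "40 Gb/s (or faster) Ethernet",
}


class JunosInterface(NamedTuple):
    media: str  # prefix such as "xe"
    fpc: int    # on this platform, effectively the VC member number
    pic: int    # 0 unless an expansion module is involved
    port: int   # physical port number, counted from 0


# Matches names of the form <media>-<fpc>/<pic>/<port>, e.g. xe-1/0/23.
_NAME = re.compile(r"^([a-z]+)-(\d+)/(\d+)/(\d+)$")


def parse_interface(name: str) -> JunosInterface:
    """Split a physical interface name into media type, fpc, pic and port."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a <media>-<fpc>/<pic>/<port> name: {name!r}")
    media, fpc, pic, port = match.groups()
    return JunosInterface(media, int(fpc), int(pic), int(port))


def describe(name: str) -> str:
    """Return a one-line, human-readable description of an interface name."""
    ifc = parse_interface(name)
    speed = PREFIXES.get(ifc.media, "unknown media type")
    return f"{name}: {speed}, VC member {ifc.fpc}, PIC {ifc.pic}, port {ifc.port}"


if __name__ == "__main__":
    print(describe("xe-0/0/23"))  # a port on the first (master) switch
    print(describe("xe-1/0/23"))  # the same physical port on the second member
```

Note that the member number lives in the interface name itself, which is why the ports on the second switch differ from the first only in that leading digit.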

So, if you get stuck with the configuration of your non-master Juniper switch ports on this platform in this configuration, remember that they're all Unit 0, and remember the Juniper interface naming convention.
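
As a rough sketch of what "configure them yourself" can look like, the Python below generates "set" commands for a range of ports on a second VC member, using the member number as the first digit of the interface name and unit 0 throughout. Treat the details as assumptions: the helper name and the VLAN name "staff" are hypothetical, and the statements use the "interface-mode" style, which is what I would expect on this platform; older EX guides use a different keyword ("port-mode"), so check the output against your own device before pasting anything.

```python
from typing import Iterable, List


def member_port_commands(
    member: int,
    ports: Iterable[int],
    vlan: str,
    media: str = "xe",
    pic: int = 0,
) -> List[str]:
    """Build 'set' commands that make each port an access port in one VLAN.

    The statement names are an assumption about the platform's syntax; this
    only shows how the member number and 'unit 0' fit into each line.
    """
    commands = []
    for port in ports:
        ifname = f"{media}-{member}/{pic}/{port}"
        # Always unit 0: the member number is carried by the interface name.
        base = f"set interfaces {ifname} unit 0 family ethernet-switching"
        commands.append(f"{base} interface-mode access")
        commands.append(f"{base} vlan members {vlan}")
    return commands


if __name__ == "__main__":
    # First four ports on the second switch (member 1), hypothetical VLAN name.
    for line in member_port_commands(member=1, ports=range(4), vlan="staff"):
        print(line)
```

Generating the lines this way is mostly a convenience for a first bulk configuration; for a handful of ports, copying a stanza in the CLI is probably quicker.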

You'll also find that the EX4600 are sort of hybrids between the EX and the QFX command set. Sometimes the Day One guide for EX switches doesn't work, and you need to edit the command to be a bit more like the QFX equivalent (cue much googling of "doing the thing EX4600"). I wrote notes into my dead trees printout of the Day One guides for the commands actually needed, which I used whilst configuring it; I'm sure I'll thank myself in later years/months. Such notes should end up in our in-house documentation wiki too...

One thing about the command syntax that is really cool is that you can copy "stanzas" around very easily, which makes repetitive configuration simple. You can also rename interfaces - so if you change a gigabit SFP out for a 10 gig one (or vice versa), you simply rename the interface, and it becomes a working 10 gig interface with no other configuration changes (but it won't automatically change this for you - if you want that sort of level of "automaticness", define both of the interface types with all the relevant configuration; it will make whichever one is relevant active - it makes your config rather longer and less elegant though). And being able to make a whole bunch of changes, and then commit them when you're finished (and have various levels of backup configs you can revert to if it all goes horribly wrong) is pretty nifty. Effectively, it's a very specialised PC running BSD with a bunch of "let's be good routers/switches" features integrated on top. Such a design means you should carefully consider what you're doing so that packets generally don't end up getting processed by the CPU, but remain in the specialised ASICs for speed.
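
The "edit, then commit, with backups to fall back on" workflow can be pictured with a toy model. The Python below is purely a conceptual sketch and not how JunOS is implemented: the class and method names are invented for this example, a configuration is reduced to a flat dictionary keyed by interface name, and the default history depth of 50 is an assumption. It shows three ideas from the paragraph above: edits touch only a candidate copy, a commit makes the candidate active and pushes older configs down a history list, and a rollback loads an older config back into the candidate, which still has to be committed.

```python
import copy


class ToyConfigStore:
    """Toy model of candidate/active configs with a rollback history."""

    def __init__(self, max_history: int = 50):
        # history[0] is the active config; higher indexes are older commits.
        # The depth of 50 is an assumed default for this illustration.
        self.max_history = max_history
        self.history = [{}]
        self.candidate = {}

    @property
    def active(self) -> dict:
        return self.history[0]

    def set(self, name: str, settings) -> None:
        """Edit the candidate only; the active config is untouched."""
        self.candidate[name] = settings

    def rename(self, old: str, new: str) -> None:
        """Move a stanza to a new name, e.g. a ge- port becoming an xe- port."""
        self.candidate[new] = self.candidate.pop(old)

    def commit(self) -> None:
        """Make the candidate active and remember the previous configs."""
        self.history.insert(0, copy.deepcopy(self.candidate))
        del self.history[self.max_history:]

    def rollback(self, n: int = 0) -> None:
        """Load an older config into the candidate; a commit is still needed."""
        self.candidate = copy.deepcopy(self.history[n])


if __name__ == "__main__":
    store = ToyConfigStore()
    store.set("ge-0/0/5", "uplink to library")
    store.commit()
    store.rename("ge-0/0/5", "xe-0/0/5")  # swapped the optic for a 10 gig one
    print(store.active)                   # still the ge- stanza until commit
    store.commit()
    store.rollback(1)                     # changed my mind: reload the old one
    store.commit()
    print(store.active)
```

The point of the model is simply that nothing you type affects the running network until the commit, and that undoing a change is just another commit of an older config.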

On the other hand, so far, my experience of Juniper tech support has been terrible. I still have not persuaded them to even let me properly register myself or my devices through the support website (after more than a week), nor can I even read the "secure" messages tech support send me through the website, because I am denied permission. *head*, meet *desk*. "Yes, I already sent you my serial numbers, here they are again".
And again.
And again.
And I'm still not properly registered, in June, some four months or so after asking several support representatives to please make my support account work.
I've sort of given up, in fact.
One of these days I'll probably get around to asking the reseller to push some buttons with their contacts to make it happen.
I hope if I ever need JTAC for something less trivial than account and device registration, it will actually work...

No, you can't use our support website. No.
No.
Never.
Screenshot from 9 June 2016

*"recently" was when I first started to write this post, way back in January...! Since then, I've learnt a bit more about the switches, and I'm very happy with them to date. Apart from the tech support. (And yes, after I talked to the reseller, they managed to get things moving and I eventually got a working Juniper login!). 

5 comments:

  1. I worked for Juniper for 10 years as a Consulting Systems Engineer and their products are great. The Virtual Chassis is an amazing concept that works really well. Being able to manage up to 10 devices as a single unit brought massive benefits to large scale switching deployments by significantly reducing the operational overhead. In fact before I left there was talk of boosting this up to 32 but that may have been on the QFX series of switches. JUNOS itself has many great features such as the "replace pattern" feature. Using this you can from anywhere in the configuration hierarchy choose to replace any repetitive string with some other value. Also being able to simply copy one interface to another or simply rename an interface is a huge time saver and of course there is the "commit confirmed" function that automatically and completely hitlessly will revert a misconfigured device back to the previous working config WITHOUT having to reboot the device (unlike Cisco's "reload in" feature). When configuring a Juniper box you are configuring the candidate configuration. This has ZERO effect on the underlying running config until you type "commit". This means you can be making changes all through your working day without affecting the stability of the network, then at the end of the working day simply commit and job done. Each Juniper device stores up to 50 previous configs on each device AND has the FULL Juniper configuration guide of every single box. The latter being great when you are working late at night in a cold Data Centre without Internet access and need to reference a command. Yes their support could be better but having worked for Cisco previously and now working for another large vendor I can promise you they ALL have problems with support from one degree to another. Best of luck with your deployment.

  2. Thanks John - your experiences echo my own on the platform, and since first drafting this post, I've experienced numerous of the "nice features" you list in your helpful comment. I seem to recall you can go up to about 10 units in VC configuration on most of the EX series - it depends on the platform, though, as you mention. Pretty handy for mid-sized campuses.

    Incidentally, the regional "Inside Territory Account Manager" phoned me on Friday (after I asked my VAR if there was anything they might be able to do about it) to look into my issue logging into the Juniper support area; it looks like the account I initially created only had "guest" permissions, so I hope it will be resolved soon. Quite why that was beyond several frontline tech support people is mystifying, but we all have those days, I guess! I think it also demonstrates that they take these kinds of issues quite seriously.

    How do you access the on board documentation? I haven't come across that feature yet. Or is it simply a "man this" kind of situation? Junos documentation is very easy to find online - but that makes finding how to get at the onboard stuff trickier through google!

  3. I simply refuse to go back to Cisco purely because of the poor structure of IOS. Cisco make good hardware, there is no doubt about that, but their operating system... yuck. Coming from a programming background before going to networking many many years ago, the hierarchical configuration of Junos is considerably better (also means copying and pasting between clients' configs is a breeze!) It is just completely logical when finding areas of the configuration that need to be changed. Cisco IOS has simply not changed enough in the 20 years I've worked with it to be competitive. The JUNOS rollback and commit confirmed commands alone are absolute life savers and worth going to Juniper. Altering a running config on the fly is just plain stupid, even developers have to compile their code before running it. Juniper virtual chassis (much like Cisco stackwise) is considerably better too, it is a faster backplane compared to similar Cisco model switches, it has far better troubleshooting ability, the naming convention of ports is logical etc. I also love the idea of having a "rescue config" on Juniper too - pretty much your "gold image" which you know you can revert to if you ever stuff up your configuration. JTAC (Juniper support) is fantastic - and I think it's got better and better as the company has grown. Cisco IMO has not. All in all Juniper is just freaking good enterprise level networking hardware at a reasonable price that just works. No wonder more and more Telcos are preferring it over Cisco kit.

    Replies
    1. Indeed - my more learned colleagues at the local university also come from a programming background and waxed lyrical about these same features.

      I come from a life sciences background and somehow shunted across into IT, so I just take things as they are - my introduction to enterprise network switching was on Cisco, so it "makes sense" - now, at least; Juniper syntax and config didn't exactly take much getting used to and seems superior, once you figure out a few key things.

      And the Juniper Day One documentation is generally fantastic.

  4. Incidentally, Juniper are now (2020) giving away all of their associate-level qualification training, and offering free certification. Formal study of the platform makes things like Unit 0 make a lot more sense.

    I've since done JNCIA-Junos and JNCIA-DevOps, and will likely do others and start the higher level ones RealSoonNow.

    Also, at my subsequent day job, I ended up working with a LOT of Junipers (EX and MX), which, on the whole, get the job done quite well, and I'm familiar enough with Junos that other routing CLIs often seem primitive and wrong.
