School Sysadmin
This blog follows my exploits as a one time Network Architect / Systems Administrator / IT Manager at a university in South Africa. When you've RTFMd, and it didn't help, WABM!
James Stapley (http://www.blogger.com/profile/10040742550730807408), published 2022-10-27
Unsucking Wi-Fi - a quick and dirty guide for the neophyte wireless engineer
<p>Recently, some of my colleagues have been coming to me for advice on wireless. There is a TON of information you need to know to get good at wireless networking, but there are also some quick tips that can get you quite far, quite fast. </p><p>The intended audience is a technical person who's somewhat unfamiliar with wireless, or seeks to quickly learn a bit more. If you want an even more basic primer, check <a href="https://schoolsysadmin.blogspot.com/2020/03/tune-your-home-network-for-work-from.html">my previous article</a>. </p><p>This article <i>doesn't</i> particularly cover external Pt(M)P networking that may be used to cross a campus or run a WISP - it's focused on wireless supplying end clients (i.e. "traditional" in-building wireless coverage). </p><p>It is not going to be the be all and end all of wireless (it's not several thousand pages long, for a start); some things are context-specific, and guidelines are sometimes best broken once you understand more. Realise this is, at best, a primer to going off and doing (a lot) more reading (and careful experimentation!).</p><p><i>Read on...</i></p><span><a name='more'></a></span><h1 style="text-align: left;"><i style="font-size: medium; font-weight: 400;">Note: this post is a work in progress/living document, and is likely to get more work over time. <br />Last updated 28/10/2022.</i></h1><h1 style="text-align: left;">First of all...</h1><p>Does it really need to be on Wi-Fi? 
<b>#WireAllTheThings</b></p><p>Some sage once put forward a simple statement: "<i>Wire what you can; Wi-Fi (only) what you must</i>" - in other words, if it doesn't need to be on the wireless, <i>don't use wireless</i>. </p><p>Anything that is literally fixed in place (desktops, docking stations, AV equipment, etc) that needs network access should (MUST!) be wired, and if that equipment is missing an Ethernet port, it is not fit for purpose. Things that lack a NIC but have USB can easily be fixed with a suitable USB Ethernet adapter. Save scarce radio spectrum for applications where you CANNOT possibly fulfill a connectivity need any other way. A previous workplace enshrined these requirements in the network connection policy, and they helped enormously. Every device you take off the Wi-Fi saves room for one that can connect in no other way. </p><p>Provide wired options wherever you can, and warmly encourage their use. If you're fancy, make sure you're using things like 802.1X and associated NAC to keep things nice and secure. </p><p>Things not on wireless firstly (mostly) avoid any interference from anything else - they have "wire speed" service practically guaranteed. Furthermore, their actions don't typically tie up all the local network resources; a single wireless client can completely tie up an AP. In contrast, a modern wired network switch can handle pretty much every single port flat out, so long as there is enough uplink bandwidth. Wired (copper or even fibre) connections are particularly valuable for anything that is grumpy about latency, jitter, out of order packets or packet loss (like VoIP or videoconferencing), or that needs very high throughput (where things like TCP congestion avoidance algorithms and bandwidth delay products come into play). 
Contrast the speeds you can get out of singlemode fibre with 5GHz wireless, and you'll rapidly see that shared spectrum radio is a horrible solution at scale. And the fibre is full duplex, unlike the radio.</p><p><br /></p><h1 style="text-align: left;">Give me a picture of ALLLLLLL the things I need to think about</h1><p>This other network blog has a diagram I've found quite useful in the past: </p><p><a href="https://wirednot.wordpress.com/2016/03/19/the-soon-to-be-famous-cocktail-napkin-wi-fi-big-picture/">https://wirednot.wordpress.com/2016/03/19/the-soon-to-be-famous-cocktail-napkin-wi-fi-big-picture/</a></p><p>By the author's admission, it's not 100% complete or accurate, but it is a good guideline to figuring out wireless issues top to bottom. Save a copy somewhere, or print out a couple of copies for you and colleagues. </p><p><br /></p><h1 style="text-align: left;">Cocktail parties everywhere, but not a drop to drink...</h1><p>People often talk about many of the problems in wireless networking through the example of a (cocktail) party - there are so many conversations around you that you can't make out any one of them. As someone who really struggles with auditory discrimination against noise, I get this on a visceral level. </p><p>Many wireless problems come down to 3 "noise-related" problems:</p><p></p><ol style="text-align: left;"><li>Straight up noise (not wifi, just other RF emissions in the frequencies of interest); </li><li>Co-Channel Interference (CCI - other wifi users on the same frequency and channel width); </li><li>Adjacent Channel Interference (ACI - neighbouring wireless "channels" that overlap with yours).</li></ol><p></p><p>As a first step, make sure you're not unreasonably suffering from ACI. This is a classic problem in 2.4 GHz wireless, where all too often, devices will settle on channels other than 1/6/11 - find them, and fix them if you can. 
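</p><p>As a rough illustration of that first check, here is a minimal Python sketch. It takes a list of (SSID, channel) pairs, such as you might copy out of your AP's neighbour scan, and picks out anything in 2.4 GHz that is not on 1, 6 or 11. The function name and the scan data are made up for the example; they are not from any vendor's tooling.</p>

```python
# The three 2.4 GHz channels that do not overlap each other at 20 MHz width.
NON_OVERLAPPING_24GHZ = {1, 6, 11}


def off_plan_networks(scan):
    """Return the (ssid, channel) pairs that are not on channel 1, 6 or 11."""
    return [(ssid, channel) for ssid, channel in scan
            if channel not in NON_OVERLAPPING_24GHZ]


# Hypothetical scan results, typed in by hand for illustration only.
example_scan = [
    ("office", 1),
    ("guest", 6),
    ("printer-setup", 3),
    ("neighbour", 9),
    ("lab", 11),
]

for ssid, channel in off_plan_networks(example_scan):
    print(f"{ssid} is on channel {channel}")
```

<p>Anything in that list that you control is a candidate for moving to 1, 6 or 11; anything you don't control is at least worth knowing about when you pick channels nearby. 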
</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4gJMZbXDwX7F1gNXYRdfzLArmdV0mZk8bQVSmVGpuYRQgll8Yx1UT1iPcXVZMZW_skkM3lwam7_G-MoaBYDpPHg3_nYgVbHlzwluM0dYsFs7mZ2jygv8t23Sa4ubkIPtOdA6TZGbPqeva9GG-gWD3bKO0I1J-NxDAvMRBY_T8lmJeXnu9ZYYeDb-j/s1179/ofcom_wifi_6ghz_band_use_example.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Wireless spectrum chart for UK region" border="0" data-original-height="747" data-original-width="1179" height="253" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4gJMZbXDwX7F1gNXYRdfzLArmdV0mZk8bQVSmVGpuYRQgll8Yx1UT1iPcXVZMZW_skkM3lwam7_G-MoaBYDpPHg3_nYgVbHlzwluM0dYsFs7mZ2jygv8t23Sa4ubkIPtOdA6TZGbPqeva9GG-gWD3bKO0I1J-NxDAvMRBY_T8lmJeXnu9ZYYeDb-j/w400-h253/ofcom_wifi_6ghz_band_use_example.png" width="400" /></a></div><div class="separator" style="clear: both; text-align: center;">Here's a basic example for the UK (image from <a href="https://www.ofcom.org.uk/__data/assets/pdf_file/0036/198927/6ghz-statement.pdf">Ofcom</a>)</div><div class="separator" style="clear: both; text-align: center;">Keep a chart like this handy when you're trying to figure out what channels to assign where! <br />Obviously, use one for your country.<br /><i>Click to enlarge.</i></div><p></p><p>There is little you can do about noise, ACI or CCI from networks or devices you cannot control - networks in shared or closely co-located spaces inherently mean you are <i>not</i> in control of everything, and the proliferation of things using wireless as a connection mechanism mean an increasingly hard-to-operate environment - bluetooth on everything, wireless printers and other IoT <strike>garbage </strike>devices, other uses of ISM bands, and so on. The only way to "win" that war is to be louder and closer to "your" devices - that is, having access points closer to the clients you're trying to connect to. 
Remember, radio waves follow an inverse-square law (received power falls off with the square of the distance), so distance is king - be close! Read the <i>High Density</i> section later for more on that. </p><p>It's time to have a look at the RF environment you're faced with. Hopefully, you have a vendor that supports some way of visualising the RF spectrum around the access points at least. (If you are at a properly equipped company, you should have more advanced equipment and software, like the stuff Ekahau sell). Even without the fancy stuff, you should get some indication of roughly what the RF environment "looks like" to your APs with any vendor that doesn't completely suck. Even the most basic will tell you what other SSIDs they can see on what channels - not a brilliant option compared to actual spectrum analysis, but better than just guessing channels. </p><p>Check to see that:</p><p></p><ul style="text-align: left;"><li>Your RRM (radio resource management) "auto" settings aren't being brain-dead. Earlier today, I saw 3 neighbouring APs using the same 2.4 and 5 GHz channels. Yes, from a major vendor. No, I was not impressed. Yes, the site was reporting that their Wi-Fi "sucked" around those APs. Things got markedly better when I told the system to use 40 instead of 80 MHz channels and to redo the RRM calculations. </li><li>You are using a "quiet" channel in that location - many vendors will either have a full time scanning radio (invaluable) or allow you to temporarily retask a wireless radio to do some basic spectrum analysis (at the cost of dropping clients) - these are VERY useful features. </li><li>That quiet channel doesn't overlap with others (so in 2.4GHz, there are only 3 non-overlapping channels in most countries - 1, 6 and 11 - and anything in use on a channel other than those is causing issues to the channels either side of it). There are good diagrams of this online. 
</li><ul><li>In 5 GHz, that's slightly more complex, but the good news is there are more channels to choose from - but not if people use outlandishly wide bandwidths. Upcoming 6GHz will give us more spectrum - once we have client devices that can use it. </li><li>Valid channels change by country, and also over time, as regulators update national permitted uses, such as Ofcom relaxing the DFS rules for some of the 5GHz channels, or adding additional radio spectrum for WiFi (like 6GHz). </li></ul><li>The bandwidth you use is appropriate. Using narrower bands (i.e. 20MHz instead of 40, 80 or 160 in 5GHz) means you can assign more unique "cells", which helps - completely clean channels are WiFi nirvana. Some authors have also claimed that narrower channels help "punch through", or are somehow easier for radios to deal with under interference than wider ones, but I've always viewed this with some skepticism. Still worth a shot though, and aside from those claims, using spectrum more efficiently is usually a net gain! Be aware that more APs on non-overlapping 20 MHz channels is often better than fewer on 40 or 80 MHz wide channels (forget 160, it's unusable in the real world). Where client devices only support 20MHz, the rest of the band is wasted (later standards do interesting things about this with things like MU-MIMO, or 802.11ax's RUs, where supported). </li><ul><li>For 2.4, 40MHz wide is a travesty and should never be used. You only have 3 channels with 20MHz bandwidth; with 40MHz, you're down to one (because the other one overlaps no matter whether you use 1 or 6/11). </li><li>The trade-off, of course, is speed - more bandwidth = more data per second, so your throughput decreases as the bandwidth assigned goes down. However, you will generally see consistently better performance overall with more APs that don't overlap serving smaller numbers of clients per radio. 
</li></ul><li>If you can, ABSOLUTELY make use of DFS channels, but be aware of "DFS hits" (where the AP detects radar and has to vacate the channel), particularly if you are near to aviation users and/or weather radar stations. </li><li>Consider adjusting the TX power to a suitable (usually lower) limit. </li><li>Balance power between 2.4 and 5 GHz wireless (typically 2.4GHz lower TX power than 5GHz) to help encourage devices to use 5GHz. Consider where to disable 2.4 entirely on some APs in high density designs. </li></ul><div>Some other tragic things to check: </div><ul style="text-align: left;"><li>Get hold of nationally relevant permitted frequency bands for 2.4 and 5GHz Wi-Fi (this varies per country); make sure any details around indoor vs. outdoor and other permitted uses (e.g. fixed broadband access, etc) are suitably covered by your vendor - and your settings (it is your problem if your AP is doing something illegal in your jurisdiction, not the vendor's). </li><ul><li>Remember: settings for interior vs exterior client APs may be different</li><li>Remember: settings for wireless bridges (P2P or P2MP) may be different</li><li>If you're messing with antennas, be aware of EIRP limits. </li><li>Vendors will often call these something like "regulatory domain" - find somewhere you can set a country, and do so! </li></ul><li>Turn off wireless on things that don't need it - wireless printing features are a common fail here. Don't clutter those precious airwaves with junk. Most models will allow you to permanently disable the feature. This problem becomes increasingly common each year with IoT devices, most of which broadcast some kind of wireless network before they associate with yours (and some continue to do so). </li><li>Consider suggesting that the use of personal hotspots is against good practice (I've seen it cause chaos in shared accommodation). 
</li><li>Verify AP placement isn't brain dead (inside or behind a metal cupboard or similar obstruction), or hidden somewhere in a corner, or incorrectly mounted for the coverage pattern of that model (APs can be mounted in the wrong orientation or too high or low off the ground - RTFM!). </li></ul><p></p><p><br /></p><h1 style="text-align: left;">You're holding it wrong</h1><p>There was a time when you could literally hold a certain Apple device wrong and it would cause issues. This is still the case, and not just with that model. If large bodies of water (AKA people) are between a user's device and an AP, they WILL be blocking RF. If the device is in a metal drawer or at floor height, the user may not find things work as expected. </p><p>It's possible the very act of cradling a device in your hands may block enough signal (particularly in a marginal coverage area) to render it unusable. </p><p>Move around and see if it gets better!</p><p><br /></p><h1 style="text-align: left;">Move your ass</h1><p>The vagaries of RF propagation can mean moving may resolve a multitude of sins. Since the dawn of MIMO and antenna diversity, things like multipath and destructive interference have gotten less bad (and spatial streams make multipath a <i>good</i> thing), but they still may cause issues in challenging environments. Always get a user reporting problems to go and reproduce their issue close to a "known good" AP - if possible, <b>verify</b> they are actually associated with that AP before they attempt to reproduce the issue. </p><p>If a device has worse than, say, -65dBm RSSI (i.e. a number further from zero than -65, like -80 or even worse), its user is <i>not</i> going to have a good time if they expect high bandwidth, low latency and low loss. -65 dBm or closer to zero (like -50) is better; I don't consider a client getting worse than -65dBm adequately covered. </p><p>This can also cover re-positioning APs, where they may be sub-optimally located to properly cover clients. 
In some cases, physically moving a badly placed AP is the only real fix. </p><p>Obviously, if there are legit areas users need to be in that don't have coverage, that will only be solved by adding APs. </p><p><br /></p><h1 style="text-align: left;">Not all waves are equal</h1><div>You should be aware that 5GHz is generally associated with better wireless (more or less entirely because there's way more spectrum available than at 2.4GHz), but you must also be aware that on the whole, 2.4 GHz frequencies are much better at going further than 5GHz frequencies, so changing power (TX) levels between 2.4 and 5 GHz radios can be helpful in most designs, as well as sometimes entirely disabling 2.4 GHz on some radios/APs.</div><div><br /></div><div>6GHz is going to play an increasingly big role, where supported and available (and yeah, you'll need new APs and recent clients or new cards or dongles). </div><div><br /></div><div>Emerging alternatives to Wi-Fi like CBRS look really interesting (think basically "private LTE/5G"), and use different spectrum (3.5-3.8GHz or so) - again, only if your clients can support it, but it may be ideal if you need to make an "it absolutely must work" wireless network in areas that are utterly swamped by competing wireless systems, like office buildings and shopping centres and the like. </div><div><br /></div><h1 style="text-align: left;">Eww! It's STICKY!</h1><p>From time to time, you will come across wireless issues caused by "sticky clients". </p><p>A common scenario is that devices stay associated with (are "sticky" to) the AP they first associate with when they enter a building or go down a corridor - even when there are much better APs in the locations the device ends up in shortly afterwards. </p><p>Check to see if particular APs have a lot more connected devices than you might expect - take a look at things like APs near to entrance doors to buildings, in particular. 
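</p><p>If your controller can export a list of which client is on which AP, a few lines of Python will do that check for you. This is only a sketch: the (client, AP) pairs, the AP names and the "more than twice the median" rule are all assumptions for illustration, so pick a threshold that makes sense for your site.</p>

```python
from collections import Counter
from statistics import median


def overloaded_aps(associations, factor=2.0):
    """Return APs whose client count is more than `factor` times the median."""
    counts = Counter(ap for _client, ap in associations)
    typical = median(counts.values())
    return sorted(ap for ap, n in counts.items() if n > factor * typical)


# Hypothetical export: 9 clients on the entrance AP, 2 or 3 on each of the rest.
associations = (
    [(f"client-{i}", "ap-entrance") for i in range(9)]
    + [(f"client-{i}", "ap-corridor") for i in range(9, 12)]
    + [(f"client-{i}", "ap-office-a") for i in range(12, 14)]
    + [(f"client-{i}", "ap-office-b") for i in range(14, 17)]
)

print(overloaded_aps(associations))
```

<p>An AP that shows up here is not necessarily broken, but it is a good place to start looking for clients that should have roamed elsewhere. 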
</p><p>One of the travesties of wireless is it is mostly controlled by the client devices themselves, NOT the wireless systems. (Cellular clients, on the other hand, are controlled by the network - look how much better that typically works!). This means that clients will often do things that are, if you have a better view of the network, rather silly. There are some hacks you can try to adjust this behaviour.</p><p>For example: </p><p></p><ul style="text-align: left;"><li>They'll stick to *that* AP far longer than you might expect, even if there is a "stronger" one in range.</li><ul><li>Careful layout of APs and TX strength adjustments can help control this. </li><li>Extensions such as 802.11 r/k/v may also help, but be cautious in deploying them around old client kit</li><li>There may be vendor-specific hacks you can try</li><li>There may be minimum client data rate (preferred) or minimum RSSI (sometimes problematic) settings you can use that make the AP de-associate a client outside of reasonable levels. </li><ul><li>Minimum data rates are often very helpful; anything less than like 11 or 12 Mb/s is ridiculous, unless you have to support really legacy stuff. </li><ul><li>higher than this may well be desirable in high density designs</li><li>if you have to support really legacy stuff, your life sucks.</li></ul></ul></ul><li>They'll stick to *that* AP because it supports a more modern wireless standard, or has a fancier radio with more spatial streams</li><ul><li>It's generally best to stick to one make, model and generation of AP within a building</li><li>Adjusting (reducing) TX strength settings may force earlier roaming as the signal will drop off faster (but you may need more APs to cover a given area)</li><li>make sure the "best" APs are used in the most sensible locations</li></ul><li>They'll stick to *that* AP because it is using a wider channel width</li><ul><li>Use the same channel width throughout a building or campus. 
</li></ul><li>Suggest users turn their device or wifi off and on again "when they get to their desk (or room)" - this can frequently force a client device to find an actually better AP. Not to mention this often helps poke slightly b0rked client devices and WiFi stacks. </li><li>There are more reasons, of course, most vendor-proprietary - but many of them aren't things you can do anything about. </li></ul><div>You might find learning about the "green diamond" helps you understand some of this better. Try https://wlanprofessionals.com/greendiamond/</div><div><br /></div><p></p><h1 style="text-align: left;">"That AP predates the founding of some <i>countries</i>!"</h1><p>Ancient wireless systems should be gracefully retired, or inadvertently hit with a heavy, blunt object until they are replaced (I jest, property damage is bad, mkay?). </p><p>If you're still using 802.11n or earlier access points, you have a very strong reason to motivate for modern APs, particularly if you have to support lots of users doing high bandwidth stuff. <br />Should you go for 802.11ax (aka WiFi 6)? <br />Probably, but definitely don't get anything older than 802.11ac wave 2. <br />WiFi 6E is basically WiFi 6, with the addition of 6GHz spectrum support - which will be great until everyone climbs on that frequency too (and of course only if your client devices even support it). That's (as of the time of writing) the latest, greatest option.</p><p>With everyone having gotten used to videoconferencing as the "norm" for communication during the pandemic, everyone now does this, and it's an enormous strain on wireless. In very light usage scenarios (one or two users occasionally doing email and browsing non-video websites), old APs might still work, but you need to move them waaaaaay out to the edges of your coverage where they might remain "fit for purpose" - but on the whole, if it's that old, you want to get rid of it. 
</p><p>High Density designs are what you should be providing, designing or otherwise advocating for. </p><p>You may also find some older designs do things like having ALLLL the APs on the same channel (!) - this was quite popular with some vendors for "seamless" migrations between APs. This is <i>horribly</i> antiquated!</p><p>It is also hypothetically possible that an old, unsupported AP could become illegal if your local regulator changes its mind about what is allowed (and you don't take action to prevent it doing the no longer allowed thing). This is one reason it's vital to make sure that the country is set correctly on your devices, and that where there are regulatory domain updates in firmware for your region/country, you apply them. And, quite aside from the performance and security issues at stake with EoL devices, it gives you yet another reason not to support things that the vendor themselves no longer support. </p><p><br /></p><h1 style="text-align: left;">RSSI - can you hear me? </h1><p>Remember, wireless is two way. </p><p>There's what the AP sends and receives, and there's what the client sends and receives - both will be affected by the RF environment, by various environmental factors which may not be equal in both directions, and by device-specific issues around antenna diversity, gain and effectiveness, along with transmitted signal strength, radio receiver sensitivity (and discrimination) and so on. </p><p>Make sure you have a good handle on what both sides of the equation look like - you may need to go over there and literally take a look; having a "standard" device that is used to survey things can be quite useful in determining if a given location is "adequately" covered or not (-65dBm or better), but most basic mobile phones will have apps that will give you "good enough for government work" answers to signal strength in a particular area. 
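</p><p>If you jot down readings as you walk around, checking them against that -65dBm figure is trivial to script. A minimal Python sketch follows; the location names and readings are invented for the example, and the threshold is just the rule of thumb used in this article, not a standard.</p>

```python
COVERAGE_THRESHOLD_DBM = -65  # the "adequately covered" rule of thumb used here


def is_adequately_covered(rssi_dbm, threshold_dbm=COVERAGE_THRESHOLD_DBM):
    """True when the reading is at the threshold or closer to zero."""
    return rssi_dbm >= threshold_dbm


# Hypothetical walk-around readings: (location, RSSI in dBm).
readings = [("reception", -52), ("meeting room", -65), ("stairwell", -78)]

weak_spots = [place for place, rssi in readings
              if not is_adequately_covered(rssi)]
print(weak_spots)
```

<p>Remember the comparison runs "backwards" from how people tend to say it: -52 is a stronger signal than -78, because it is closer to zero. 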
</p><p>Take a look at signal strength indicators if you can, and use that insight to help understand, diagnose and resolve issues related to signal strength. </p><p><br /></p><h1 style="text-align: left;"><b>That's <i>too</i> busy!</b></h1><p>The "busy-ness" of a given channel can have a very significant effect on how well wireless operates for the end users. </p><p>If the airwaves are congested, inherently, you've got a wireless traffic jam - there isn't enough free space to get more data delivered. This can be from a mix of "your" wifi, as well as ACI and CCI from other neighbouring APs (yours or another network's), as well as any non-Wi-Fi "noise" that may affect things. </p><p>Wireless is very much a victim of its own success. Getting clients to send or receive their data as quickly as possible is key - and, of course, if you simply have too many clients wanting too much data, the airwaves will be busier than ideal - so you need to spread them out (across more APs and different channels). </p><p>This is where things like speed limits (shaping) can actually make things worse!</p><p>Or, you know, get some heavy users OFF wireless - provide wired "hot desks", identify heavy users and help them get wired up, and so on. </p><p><br /></p><h1 style="text-align: left;">There's a limit to the usable number of connected client devices per radio</h1><div>Any decent vendor will give you an indication of the number of client devices that can (under ideal conditions and within a certain performance envelope) connect to a single radio. This number is, on the whole, around 25. </div><div>This will reduce where you have old clients, weakly connected clients, interference, and other negative influences, or particularly demanding (continuous high bandwidth) applications. 
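<p>To get a feel for what that means in practice, here is a back-of-the-envelope Python sketch. The 25 is the ballpark figure above, while the 180-device venue and the derated figure of 15 are assumptions purely for illustration. It gives a rough lower bound on radio count, not a design.</p>

```python
import math

TYPICAL_CLIENTS_PER_RADIO = 25  # the ballpark figure quoted above


def radios_needed(client_count, clients_per_radio=TYPICAL_CLIENTS_PER_RADIO):
    """Rough lower bound on the radios needed for client_count devices."""
    return math.ceil(client_count / clients_per_radio)


# Hypothetical lecture venue with 180 client devices.
print(radios_needed(180))      # using the ideal-conditions figure
print(radios_needed(180, 15))  # derated for old or demanding clients
```

<p>Whatever number you get is a starting point for a proper design, and it only helps if those radios can actually be given channels that don't overlap.</p>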
</div><div> </div><div>If you have a LOT of devices associated with each radio, you need more radios - upgrade to a single AP that supports more clients [more radios, massive spatial streams and MU-MIMO or WiFi 6 OFDMA RUs, etc], or, better, add more APs with non-overlapping channels. When you need to cover VERY dense deployments, you'll need specialised design guides (like for stadium wireless, or in places like lecture venues). </div><div><br /></div><div>Putting more APs in, with a careful manual channel plan, narrower bandwidth and power adjusted WAY down (as well as making use of now useful radio attenuation by building materials) will allow you to re-use channels quite quickly. The density of APs needed to provide rock-solid coverage can be quite startling to the unprepared (the Meraki guide for some Cisco offices as an example is... interesting). At that stage, you're definitely into the "model in software, deploy, and verify with post-installation survey" territory. </div><div><br /></div><div>Eventually, just being associated takes up all the available capacity of an AP - so you might have 100 or whatever clients associated to an AP, but they can't <i>do</i> anything useful! As with many things, a pinch of salt when looking at "maximum" numbers is always useful. </div><div><br /></div><h1 style="text-align: left;">High Density Coverage / Design</h1><p>How we use wireless, and what our users expect (nay, demand!) has massively changed in the past 20 years or so. When Wi-Fi was first designed, we were occasionally sending or receiving the odd 100kB or so email or webpage. I've seen customers with wireless systems deployed literally 15+ years ago not getting the need to upgrade. Nowadays, we expect streams that may be tens of megabits per second to be flawlessly delivered to many devices - and most of us carry several of these around at any time. This is a HARD ask of wireless. 
</p><p>Designs from that era attempted to cover areas with as few APs as possible - but what we mean by "cover" these days has changed rather a lot. </p><p>If you're still dealing with wireless equipment or layouts from that bygone era, you're going to be well placed to think about redoing the wireless under the modern "high density" coverage paradigm. It's not literally "one AP per room" (some rooms also need several!), but that can (in some environments) not be a bad first stab approximation of the scale of the problem. Corridor WiFi (APs strung along the corridor rather than in the rooms) is also the worst design. Most vendors have a high density design guide - reading that would be a good use of your time. </p><p>Some vendor guides, in no particular order and with no endorsement implied: </p><p></p><ul style="text-align: left;"><li><a href="https://documentation.meraki.com/Architectures_and_Best_Practices/Cisco_Meraki_Best_Practice_Design/Best_Practice_Design_-_MR_Wireless/High_Density_Wi-Fi_Deployments">Meraki</a></li><li><a href="https://support.ruckuswireless.com/documents/1345-best-practices-design-guide-high-density-wi-fi-ap-deployment">Ruckus</a></li><li><a href="https://help.ui.com/hc/en-us/articles/115002806907-UniFi-High-Density-WLAN-Scenario-Guide">Ubiquiti</a></li></ul><p></p><p><i>Other vendors exist - I've just happened to use those systems and read those guides, or former versions of them, in the past. </i></p><p>Make sure you understand the needs your end users will have - what apps, devices and duty cycles are involved; how many per user; where those users are and so on. If you can afford it, there are wireless design tools that will let you model (and then validate, with a post-installation survey) a site design. Ekahau have some lovely tools here, but they (again) are not the only vendor. </p><p><br /></p><h1 style="text-align: left;">Don't Set It And Forget It</h1><p>Many people assume that the vendor's RRM (Radio Resource Management) is MUCH cleverer than it is. 
For simple sites and undemanding uses, it might be "good enough", but it will NEVER beat an intentional design and operational tweaks by someone who knows what they are doing. It can also have a bad day, and is constrained by decisions you take (AP placement, various settings around permitted channels, bandwidths and transmission power). </p><p>If you're having problems, you've probably gone beyond what that RRM solution can offer you - use this guide, and further reading, to figure out what a better set of settings might be - usually, you can get quite far with manual adjustments to things like: </p><p></p><ul style="text-align: left;"><li>what channels an AP is using (and often, switching some 2.4GHz radios off completely)</li><li>what channel width an AP is using (20 vs 40 vs 80 vs 160 MHz) - anyone using anything other than 20MHz in 2.4GHz should be <strike>shot</strike> persuaded to stop.</li><li>what TX strength is set - turning power down can often help a lot of things (if you have enough APs)</li><li>settings like minimum client data rate; disabling legacy protocols/rates</li><li>vendor specific tweaks</li><li>802.11 r/k/v (and any further similar new fun toys as they come out)</li></ul><div>Get a piece of paper out, sketch the site and AP locations, and come up with a manual channel plan, using information from your AP scanning to understand where particular frequencies may best be employed. </div><div><br /></div><div>In many cases, you'll need to walk around the site to truly get a "feel" for it. Ideally, do that with a suitable survey tool like an Ekahau Sidekick, but you can get quite far with a laptop or smartphone, some free software, and paying attention to signal strengths.</div><div><br /></div><div>Talking of RRM and AI/ML, I hear good things about Juniper's Mist wireless system with the Marvis AI. I've not played with it myself. I once asked a company in a job interview "does it live up to the hype?" 
and they were pretty effusive in praise for it. They're (very) demanding wi-fi users, given what they do, so that's one to add to your list if you're evaluating vendors. </div><div><br /></div><p></p><h1 style="text-align: left;">Duplex? Not just for printers. </h1><div>One thing a lot of people forget about wireless is it is half duplex (often loosely called "simplex"). Something can either be transmitting OR receiving, not both at the same time. </div><div><br /></div><div>This further means that if ANY device associated with an AP is transmitting or receiving, no other device on that channel can validly do so at the same time. </div><div><br /></div><div>Another related and particularly pernicious problem is that of a "hidden node", where for example two clients on opposite ends of the room to each other can't "hear" that the other is busy transmitting, so each will throw out signal anyway, swamping the other node's transmission. This is more common where you have excessively "loud" APs that have TX set too high (and VERY common in outdoor PtMP wireless). There are <a href="https://en.wikipedia.org/wiki/Hidden_node_problem#:~:text=In%20wireless%20networking%2C%20the%20hidden,are%20communicating%20with%20that%20AP.">knobs to help deal with this</a> on many platforms. </div><div><br /></div><div>This set of challenges puts very significant performance penalties on wireless in general, and yet again underscores the need to get things off wireless when you can. </div><div><br /></div><div>Another thing to be aware of is that, therefore, single very slow clients drag down the entire network performance (a device connecting at 1Mb/s will take up a lot more "airtime" for an equal amount of data sent or received vs. one at a more sane connection speed) - one reason minimum data rates can be <i>really</i> helpful in tweaking performance. 
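<p>The airtime point is easy to put numbers on. The Python sketch below is deliberately idealised: it just divides payload size by data rate and ignores protocol overhead, retries and contention, so real figures will be worse. The 1 megabyte payload and the 54Mb/s comparison rate are assumptions picked for illustration.</p>

```python
def airtime_seconds(payload_megabytes, rate_mbps):
    """Idealised seconds on air: payload in megabits divided by the data rate.

    Ignores protocol overhead, retries and contention.
    """
    return payload_megabytes * 8 / rate_mbps


payload_megabytes = 1  # hypothetical transfer size

slow = airtime_seconds(payload_megabytes, 1)   # client connected at 1 Mb/s
fast = airtime_seconds(payload_megabytes, 54)  # client connected at 54 Mb/s

print(f"1 Mb/s client:  {slow:.2f} s on air")
print(f"54 Mb/s client: {fast:.2f} s on air")
```

<p>Because the medium is shared, every second the slow client spends on air is a second nobody else on that channel can use.</p>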
In high density situations, you may find that setting some APs to permit somewhat crappy connections helps extend your coverage, and then having some others be really demanding about minimum speeds keeps the good citizens up and running quickly - but this is super site specific and requires careful consideration and iterative tweaking. </div><div><br /></div><div><br /></div><h1 style="text-align: left;">OOOH, you think you're so Spatial!</h1><div>Spatial Streams. </div><div><br /></div><div>What? </div><div><br /></div><div>You'll often see things on AP specs about "spatial streams". Or at least you'll have noticed specs like 2x2:2 or 8x8:8 - this refers to the number of TX antennas, the number of RX antennas and the number of spatial streams they can handle. Without going into book length, antenna diversity plus multipath means you can effectively re-use some spectrum and multiplex signals for higher throughput. </div><div><br /></div><div>When you're assessing problematic wireless, pay attention to the spatial streams available on the APs in question - if you're needing to deal with very demanding clients or lots of them, they can make a real difference. If it's not obvious, "more is better" and a 2x2:2 is very basic. </div><div><br /></div><h1 style="text-align: left;">Hot Potato Wi-Fi</h1><div>Get it out! GET IT OUT! </div><div>The faster you can get data across the airwaves, the more things you can serve (all other things being equal). If you have a potential to do 300 Mb/s vs 100 Mb/s, you can serve roughly 3 times more client devices (under ideal conditions) under the 300 potential scenario. This is why you may want to use wider bandwidths, and why additional spatial streams are so very useful. You will need to balance the need for clients-per-radio against coverage requirements and channel planning to see whether you can use the higher bandwidth options. </div><div><br /></div><h1 style="text-align: left;">So Broadcast, Such Traffic. Wow. 
</h1><div>I've seen "internet background radiation" in the order of 4 megabits per second of garbage (BUM) cluttering up airwaves on large wireless LAN segments. Filter that nonsense! </div><div><br /></div><h1 style="text-align: left;">Get the right tool for the job</h1><div>Get modern APs; turn off shitty all-in-one residential gateways. Mount them in the right place. Have decent backhaul and connectivity and supportive services. </div><div><br /></div><h1 style="text-align: left;">Things to Avoid</h1><div><br /></div><h2 style="text-align: left;">Meshing</h2><p>Meshing sounds cool. It isn't. Use wires for backhaul; <b>don't</b> suck up precious spectrum on backhaul! YMMV in a home, of course, but professionally, get those APs wired in!</p><p><br /></p><h2 style="text-align: left;">"wifi extenders" and "boosters"</h2><div>Hopefully, you don't have customers that think these are helpful. They're not. Use additional APs that are wired back instead, because these devices either do mesh or halve your available bandwidth (because they have to retransmit everything you're sending through them - and if you have a chain of these things as repeaters, woe betide you). </div><div><br /></div><h2 style="text-align: left;">People</h2><div>No, I'm not being a stereotypical sysadmin who would find everything works perfectly... until the users/customers turn up. People block wifi - make sure you avoid them by suitable AP placement - people height doesn't work very well - higher up tends to work better (so ceilings, usually, rather than walls, or heaven forbid, skirting height). </div><div><br /></div><h2 style="text-align: left;">Excessive SSIDs...</h2><p>Most guides will tell you to use 3 or fewer SSIDs on any given AP. Where you need to hive different users off into different VLANs/subnets, use RADIUS assigned VLANs for this on a common SSID. That allows for a general 802.1X RADIUS SSID, a guest SSID and a 'terrible legacy device' SSID where you need one. 
<br />see also: http://revolutionwifi.blogspot.com/p/ssid-overhead-calculator.html</p><p><br /></p><h2 style="text-align: left;">....Buuuut don't necessarily rule out frequency banded SSIDs</h2><div>It can be useful with some vendors and some clients to have different SSID names on 2.4 and 5 GHz wireless, so you can selectively connect to one or the other (e.g. Staff_2.4 or Staff_5.8) - because these are in different frequency bands, even though you have (say) 6 SSIDs, you only have 3 per radio. </div><div><br /></div><h2 style="text-align: left;">Don't break the law</h2><div><ul style="text-align: left;"><li>Make sure your system is set to the correct regulatory domain (country)</li><li>Make sure you apply "indoor" settings to "indoor" APs, and "outdoor" settings to "outdoor" APs (these are frequently different)</li><li>Don't exceed EIRPs by adding aftermarket antennas without understanding the consequences and requirements. <br /><br /></li></ul></div><h2 style="text-align: left;">Renegade!</h2><p>If you work in environments where change control is done (you probably should) make sure any experimentation conforms to that (and, ideally, starts in a lab). Don't be that guy that breaks the wireless when Janice from Accounting is running Payroll (of course, one might ask why Janice is even doing payroll over wireless...). :) </p><p>If you are experimenting with settings, particularly if you're new to all this, I highly recommend you ONLY do it when you're in the building you're messing with so you can immediately check the results and also canvass other users in the building about resulting effects from any changes (for good or ill). 
Document things you change and results as you go along!</p><p><br /></p><h1 style="text-align: left;">Be Nice To Your Neighbours</h1><p>Although wireless is a classic tragedy of the commons, you should try to be nice to your neighbours (don't be a dick in how you place or configure your network) - and you may find that there is much mileage in politely engaging in dialogue with neighbouring network operators to establish a more equitable or functional overall wireless arrangement in an area. Establish or help to suggest useful rules of thumb around appropriate and equitable use of shared radio spectrum (you know, no-brainers like "use wires" and "use a sane channel plan" and "turn the radio down"). </p><p>Where you have customers that do things like lease property, encourage them to provide (excellent) wireless services, with custom settings to isolate each customer into their own wireless network over common (shared) centrally managed infrastructure. It's MUCH worse when everyone in a building does their own thing. In some areas, it may even be plausible to have lease conditions to NOT run competing wireless networks (however that may be defined), although enforcement may be a challenge. </p><p><br /></p><h1 style="text-align: left;">Check the firmware</h1><p>It is often worth trying newer firmware, particularly if the problem is with newer model APs, controllers or end user devices; there may be specific bug fixes. As always, upgrade based on release notes that suggest improvement to a known issue that gels with what you're seeing, or on the advice of the vendor through a support case (or due to a known security issue or critical feature upgrade). </p><p>Don't forget that some versions will make things WORSE - like the vendor upgrade that caused ALL our APs (something like 1,000 of them) to toss all our clients off the system every 10-15 minutes for some random amount of time. 
Not great, and it took them more than 6 months to admit to and fix the regression (not to mention MANY late nights in maintenance windows repeatedly showing them how broken it was). </p><p><br /></p><h1 style="text-align: left;">What if it's not even the Wi-Fi?</h1><div>Today, many users conflate the term "Wi-Fi" with "network connectivity" - heck, I've had tickets complaining about broken or missing "wifi cables" (yeah, I know), or calling any form of Internet connectivity "Wi-Fi". To be fair, in the age of pocket devices, wireless is all too often how things are accessed - and we need to just translate what users say into what users are actually trying to report - as we always have, and always will. </div><div><br /></div><div>Don't forget to ascertain what the end user is trying to get done and blaming on "the Wi-Fi", when it might be something else (network-wide issues with DHCP or DNS; local switch misconfiguration; saturated uplinks; internet outage; service outage at some 3rd party application or service, incorrect credentials, misconfigured profile, etc). Similarly, "slowness" may not be something you can remediate if the remote resource is inherently slow, or has some limitation on the Internet path outside your control. Light only travels so fast. </div><div>Your average end user doesn't understand ANY of that.</div><div><br /></div><div>Use your normal troubleshooting processes for EVERY problem. If all else fails, start at layer 1 and work up, end-to-end...!</div><div><br /></div><div>If you've been given incomplete info, go back to the source and ask them to SHOW you the problem. </div><div><br /></div><div>And you'll be amazed how often removing and re-creating a wireless profile helps when the user is "absolutely sure" the settings are correct. Rebooting a device (ideally power-cycling) solves a good chunk more. 
("<i>Hello, IT; Have you tried turning it off and on again?"</i>) </div><div><br /></div><h1 style="text-align: left;">Some Things Are Just Broken</h1><p>Occasionally, there is literally nothing you will be able to do. </p><p>A specific client device may have a b0rked radio, driver or refuse to reconfigure their wireless settings or re-enter their credentials (yes, I've seen all of this), and there is NOTHING you can do to your wireless system. All you can do is point out the troubleshooting done, suggest alternatives (use a wire; try another device), or simply politely give up on that device / user. They may also have unreasonable expectations (such as workable wireless in areas that you don't actually cover, or particularly outlandish requirements in terms of numbers of devices associated and expected throughput). </p><p>Sometimes, there may be site-specific factors you cannot fix or control - some incredibly noisy source of RF interference which you cannot get turned off, or get some spectrum enforcement agency (FCC, OFCOM, ICASA or so on) to deal with. You'll need to use wires (or even fibre) in such circumstances. Another common issue is where neighbouring networks overwhelm parts of yours (ACI and CCI) - or, embarrassingly, your own channel plan is self-defeating. </p><p>You may have extensively documented the deficiencies in a wireless system and the organisation refuse to engage in an upgrade/refresh. You can't fix that. </p><p>Document and CYA (Cover Your Ass)! </p><p><br /></p><h1 style="text-align: left;">Remember, application is key</h1><p>No matter what you read here or elsewhere - if you absolutely have to support ancient devices (legacy things) that you cannot replace or wire up, then you may need to leave legacy protocols and data rates enabled, even if that drags down your whole network. 
</p><p>You may find creating hyper-local hotspots for legacy gear useful (particularly where that legacy gear is static, as it tends to be) - dedicated APs next to the gear with the power turned right down. The rest of your wireless can then be tuned to modern settings. And of course, you may need serious conversations with management or customers around retiring and replacing legacy client devices - if you can prove legacy devices are making the Wi-Fi suck, they may be more willing to do something about it - but if it's a multi-million (insert currency) bit of OT gear, then they're also going to support you buying an additional AP just for that thing for your hyper-local hotspot concept instead (or buy that optional NIC for that kit you've been begging to get for <i>years</i>)!</p><p><br /></p><h1 style="text-align: left;">The common (near) failure mode of Wi-Fi is "SLOW"</h1><p>One parting thought - Wi-Fi is INCREDIBLY resilient. I sometimes think it would be better if it broke much earlier than it does. </p><p>The main symptom of pretty broken wireless is "slowness" - i.e. low throughput. </p><p><i>If wireless is slow, </i>something<i> is broken!</i> </p><p><br /></p><h1 style="text-align: left;">Read more</h1><p>If this has piqued your interest, there exist various books you could read. </p><p></p><ul style="text-align: left;"><li>You'll find a good basic to intermediate primer to be something like a CWNA book (I've owned a few copies of the 5th edition Sybex CWNA-107 study guide that have stood me in good stead) - more recent editions may now exist, so do check out the latest releases before you buy anything. The ISBN for that 5th edition is 978-1-119-42578-6.</li><li>Google the problem; read some other blogs on this; read modern design and troubleshooting primers. 
</li></ul><div><br /></div><p></p>James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com0tag:blogger.com,1999:blog-5949954951402579361.post-59695506509213454902020-09-29T21:06:00.011+01:002020-09-29T22:14:39.785+01:00A home for a home lab: StarTech 12U desktop open frame 2 post rack<p> My wife (aka "Senior Management") decreed that I needed "a table or something" for the network gear that was starting to accumulate in the spare room I use as an office. </p><p>Of course, since pretty much all of it either had rack mounting hardware or could be persuaded to sit on a shelf, a 19" rack made sense and forms a much more suitable home for network gear. </p><p>After hunting around a bit online, I decided a 12U open frame 2 post rack would probably do the job well - without being the imposing monolith of a 42U four post rack, wallbox or similar, which probably would get rather more in the way of raised eyebrows from Senior Management, too - if I could even fit it in the flat. </p><p><span></span></p><a name='more'></a> I eventually settled on a "desktop" 12U open frame two post unit from <a href="https://amzn.to/33bA8ZW">StarTech, the RK12OD</a>. 
<p></p><p>The product photo looked promising (yes, it really does lean back like that):</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuEAJeLaZlGWC2aoIi6KNhL7hBXzv2Qi_XrxwZ4lKRYtR2Wlng5Y735xLZOTn9B4a_gef0O62i5m7aqqPZzG0SevCSqT9hIM3b0HsER-t-P2ADynKW7Nc4ghnBjMRaF4VWWqv7swM-Dkc/s400/rk12od.main.jpg" style="margin-left: 1em; margin-right: 1em;"><img alt="StarTech 12U two post desktop rack" border="0" data-original-height="400" data-original-width="400" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuEAJeLaZlGWC2aoIi6KNhL7hBXzv2Qi_XrxwZ4lKRYtR2Wlng5Y735xLZOTn9B4a_gef0O62i5m7aqqPZzG0SevCSqT9hIM3b0HsER-t-P2ADynKW7Nc4ghnBjMRaF4VWWqv7swM-Dkc/w320-h320/rk12od.main.jpg" width="320" /></a></div>It was pretty easy to assemble, with only 8 screws required at the base; it's well engineered enough that you don't have to be too fussy about alignment to get the posts reasonably parallel with each other. It comes with 20 cage nuts and screws, but I bought 50 off eBay. It's almost certainly intended for fairly light 19" gear, like musician's sound modules and other A/V gear, but it seems to be happy enough with networking gear in it, and doesn't look like it's about to crack under the strain. Obviously, if you want to mount something heavy or something that takes a rail mount, you really ought to get a four post rack or cabinet. <div><br /></div><div>Note that the product does actually have a backwards "lean" to it, but I've not noted any propensity to tip backwards - but be aware that this lean may make things on shelves have a propensity to migrate backwards; this can be easily solved with something like self-adhesive velcro. You ought to put heavier gear towards the bottom, in common with all racks. 
One day, I may re-rack with that in mind, but there doesn't seem to be any stability issue at all with the gear in there now.<div><br /></div><div>As second hand Juniper SRX mounting brackets seem to be more expensive than second hand Juniper SRXs, I made do with shelves for the Juniper gear. It also leaves enough space for the power transformer for the SRXs and even a Raspberry Pi server on the shelves. With so little gear, "wasting" a U for a PDU wasn't a problem. It also means if you're clumsy, you're less likely to inadvertently stab yourself in the face with a rack post!</div><div><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpUsynt-AP9CLFPKSWSGUjRUssldr8wVldQH3AKbpVXA7-N0DXWxM2C4XU5_EVTV2q6q_Si_QCouBy4ceQG0q_kV834sLvKYqhTrPv5QOlZzPYKi6Bgyw3EWXsqs2wiTvomYYXsKU9guE/s2543/20200918_230523.jpg" style="margin-left: auto; margin-right: auto;"><img alt="Rack with 3U populated; 2x Juniper SRX on shelves, 1x 19" PDU." border="0" data-original-height="2543" data-original-width="1236" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpUsynt-AP9CLFPKSWSGUjRUssldr8wVldQH3AKbpVXA7-N0DXWxM2C4XU5_EVTV2q6q_Si_QCouBy4ceQG0q_kV834sLvKYqhTrPv5QOlZzPYKi6Bgyw3EWXsqs2wiTvomYYXsKU9guE/w156-h320/20200918_230523.jpg" width="156" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Early days - just two SRXs and power.<br />Note the slight backwards lean of the posts.<br />The SRXs actually have <a href="https://amzn.to/3cFMTPz">3M velcro "command strips"</a> under the front edge to stop them sliding backwards over time - or during ethernet cable insertion.</td></tr></tbody></table><div><div><br /></div><div>In common with many 19" posts, there are cage nut holes on the edge of the uprights, as well as on the front face. 
This comes in handy for attaching things like PDUs without using a U of space.<a href="https://amzn.to/348CGr0"> I found a basic PDU with surge suppression online</a>; after turning the mounting ears around 180° on the PDU, it was quite easy to attach it to the side with cage nuts - another one could easily be added to the other side, too - which is handy, because that PDU only has 6 outlets, and this is a 12U rack! If I find a need to power up more than 6 devices at the same time, I'll add another one to the other side - possibly one that takes IEC C13 plugs instead of the UK ones; after using them in datacentres, I am a fan, particularly in countries with bulky plugs (like the UK and South Africa). </div><div class="separator" style="clear: both; text-align: center;"><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinNAWzeMWssR9FtxEXksHlZllHN5LDx6ZS77wogHSPuQxlEGkNhMcl4MkKMfzokGHiSEWY0J01GW-C5YuQMn8ePGvPQH_nSSzIWWWHxhyCRpPB993_Qb-u3-bxiqbR8nUCmOR4s_SsbKY/s2543/20200929_200422.jpg" style="margin-left: auto; margin-right: auto;"><img alt="6 outlet UK surge protected PDU mounted on the side of the rack." 
border="0" data-original-height="2543" data-original-width="1236" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinNAWzeMWssR9FtxEXksHlZllHN5LDx6ZS77wogHSPuQxlEGkNhMcl4MkKMfzokGHiSEWY0J01GW-C5YuQMn8ePGvPQH_nSSzIWWWHxhyCRpPB993_Qb-u3-bxiqbR8nUCmOR4s_SsbKY/w156-h320/20200929_200422.jpg" width="156" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">One U saved - PDU moved to side of the rack.</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2fk-xpVYI96xs-APF-uSskKPXJrzXODdRRKwUW9H7q8FhDalCosuCyTcoTfXvcbCUcJzM2fkIcW9LECFrwmHnhMAEAmoAw_dh0zvurmQ5tYR5eDhlOlxuKsNFFvelbVXgypYQrcHg3Uk/s2543/20200929_200447.jpg" style="margin-left: auto; margin-right: auto;"><img alt="Side of upright posts, showing cage nut holes." border="0" data-original-height="2543" data-original-width="1236" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2fk-xpVYI96xs-APF-uSskKPXJrzXODdRRKwUW9H7q8FhDalCosuCyTcoTfXvcbCUcJzM2fkIcW9LECFrwmHnhMAEAmoAw_dh0zvurmQ5tYR5eDhlOlxuKsNFFvelbVXgypYQrcHg3Uk/w156-h320/20200929_200447.jpg" width="156" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Note the cage nut holes in the side of the upright - there are about a 19" unit width height of them (i.e if the unit is flat, you can mount it along here). I'd prefer more than two cage nuts holding things together, but it still seems solid enough for mounting things like PDUs.</td></tr></tbody></table><div><p>After I had 7U of eBay Cisco gear delivered today, there are only 3 U left - hence I decided to save some space and re-mount the PDU on the side, as seen above. 
Below, looking a bit cramped with the PDU at the top, displaced up by 2/3rds of a U to get the power cables to clear the top router - another thing that made me consider revising the mounting arrangements.</p></div></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6tijgCZDjz67o_6RTOuBa0FJFP7g5e1O7wRS_AT5TU4P6u7TSuXnigIPw6igfKQdsn8oDjcXE7zfrp2ye-gywi36i50m3wVDV3CpwFDbAjA7rIsMus5DcJdClI6Rfkao46PwdClTomP8/s2543/20200929_163456.jpg" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="2543" data-original-width="1236" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6tijgCZDjz67o_6RTOuBa0FJFP7g5e1O7wRS_AT5TU4P6u7TSuXnigIPw6igfKQdsn8oDjcXE7zfrp2ye-gywi36i50m3wVDV3CpwFDbAjA7rIsMus5DcJdClI6Rfkao46PwdClTomP8/s320/20200929_163456.jpg" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Only 3 U left! <br /></td></tr></tbody></table>I was expecting the top three Cisco routers to have rack mount kits - they were pictured with them in the eBay advert - but apparently that was only "illustrative" of what they were sending out... Anyway, they were willing to part with 3 sets for a modest amount extra, so those should arrive shortly. I really don't like using shelves if I can avoid it - and I <i>really</i> dislike using other gear as a shelf!<div><br /><div>So, if you find yourself in need of a "home" for a modest home network lab, check out two post open frame desktop racks! 
StarTech make <a href="https://amzn.to/3n451aE">bigger (and smaller) units, too</a>.</div></div></div>James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com0tag:blogger.com,1999:blog-5949954951402579361.post-35250351580499210292020-09-23T15:54:00.008+01:002020-10-01T16:16:13.706+01:00Juniper Configuration Groups<p> Although the use of network automation / Infrastructure as Code is likely to greatly reduce their usefulness, <b>configuration groups</b> can be pretty handy things to include in your Juniper configuration. </p><p>You can use them for two major things: <br /></p><ol style="text-align: left;"><li>portable generic configuration you are likely to set everywhere on all your routers (something like DNS servers or NTP servers, or a management network gateway etc.), and </li><li>specific configuration that differs from Juniper's defaults, but is the common "base" configuration for that type of object in your network. </li></ol><div>The second tends to be the more useful.</div><div><br /></div><div>People that are not familiar with Junos may find this group inheritance behaviour surprising, and <i>surprising</i> is not a thing you really need in an operational network - so it's worth understanding configuration types in Junos that are commonly used, but not immediately obvious. </div><p></p><p style="text-align: left;">So, let's have a look at these handy magical blocks of group configuration!</p><span><a name='more'></a></span><p><br /></p><h1 style="text-align: left;">Some use cases...</h1><p>A major use at my last site on our distribution routers was ensuring that VLANs that went over (almost) all interfaces were consistently applied, ensuring they didn't get forgotten - and couldn't be inadvertently removed by issuing the wrong command; where we <i>didn't</i> want those for some specific reason, we could exclude specific interfaces from inheriting that configuration. 
Essentially, these VLANs were used for wireless access, and had to go to pretty much every building in that network sector - chances are, we wanted those VLANs on an interface, so there they were! Similarly, we used another one to ensure OSPF interfaces were passive - and over-rode that where we actually wanted neighbo(u)r relationships to form. </p><p>With some thinking, you can even use groups to define and apply a third type of group configuration, like different customer-facing services (wildcard groups are very useful here) - and when those services change, simply update the configuration in one place (in the relevant group), and it propagates through all the occurrences of that type of customer service (for instance a bandwidth allocation on a customer MPLS path for each level of service you sell). <br />You could achieve changing a load of identical values throughout a configuration with a <a href="https://www.juniper.net/documentation/en_US/junos/topics/topic-map/modifying-configuration.html#id-using-global-replace-in-the-junos-os-configuration">replace pattern</a> - but manually (re)configuring things is more prone to human error than programmatic inheritance - and replace patterns have some gotchas - make extensive use of <span style="font-family: courier;">show | compare</span> if you use them!</p><p><br /></p><h1 style="text-align: left;">How do I know if groups are changing my configuration? 
 </h1><p>One thing you may not realise is that configuration groups can be quite stealthy - they won't be output in an obvious way in <span style="font-family: courier;">show configuration</span>, except as a list of groups at the top, and some possibly not immediately obvious apply-groups configuration elements - unless you pipe display inheritance.</p><p>In other words:</p><span style="font-family: courier;"><blockquote>show configuration | display inheritance<br /><i><snip></i><br />unit 3 {<br /> encapsulation vlan;<br /> vlan-id 20;<br /> peer-unit 2;<br /> family inet {<br /> address 10.20.0.2/30;<br /> }<br /> ##<br /> ## 'mpls' was inherited from group 'mpls_lt'<br /> ##<br /> family mpls;<br /> }<br /><i></snip></i></blockquote></span><div>gives you a much better "situational awareness" of your router's configuration - without it, you might not realise the interface has the <span style="font-family: courier;">mpls</span> address family configured on it. In unfamiliar Juniper environments, bear inheritance in mind!</div><div><br /></div>If you find the comments tedious, add <span style="font-family: courier;">| except ##</span><br /><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">show configuration | display inheritance | except ##</span></blockquote><p></p><p>Note this then won't tell you where the config came from - but you<i> will </i>see the configuration that is applied to your system, including any inherited config.</p><p>You can also try <span style="font-family: courier;">show configuration | display inheritance | display set</span><span style="font-family: inherit;">,</span> although it's not the easiest to read - but you can still see which groups are being applied where.</p><p>You can also carefully examine the configuration itself to determine the groups that exist:</p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">show configuration 
groups</span></blockquote><p></p><p>and then check for apply-groups to see where they are applied: </p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">show configuration | match apply-groups | display set</span></blockquote><p></p><p>which will give you the immediate stanzas including apply-groups (and apply-groups-except), which should give you enough context to see where they are applied - if you need greater context, you at least then know where in the configuration to look! </p><p>The third, really hard way of noticing this is that you note the output of operational commands implies the presence of configuration you never recall applying. In this example, that confusion might be that interfaces have <span style="font-family: courier;">family mpls</span> on them, but it doesn't seem to be configured at first glance, yet <span style="font-family: courier;">show interfaces terse</span> shows mpls!</p><p>Finally, in that vein, you should be aware that Junos applies a whole load of defaults silently - mostly these do not cause you issues, but you should know that:</p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">show configuration | display inheritance defaults</span></blockquote><p></p><p>can display what these are and where they are applied.</p><p><br /></p><h1 style="text-align: left;">How do you create and use these groups? </h1><p>Simple, just name them and apply the config you want, using wildcards where appropriate. </p><p>A good example is setting something on an interface: </p><p></p><blockquote style="font-family: courier;">set groups mpls_lt interfaces <lt-*> unit <*> family mpls</blockquote><p><span style="font-family: inherit;">The base syntax is:</span></p><p></p><blockquote><span style="font-family: courier;">set groups <name> <config you want to apply> </span></blockquote><p>Next, you need to apply-groups to your configuration. 
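</p><p>As a rough sketch of the whole define / apply / exclude cycle, here is the passive-OSPF use case mentioned earlier. The group name <span style="font-family: courier;">ospf_passive</span> and the interface <span style="font-family: courier;">ge-0/0/1.0</span> are made up for illustration, and area 0.0.0.0 is assumed - adjust to your own network, and check the result with <span style="font-family: courier;">show | compare</span> before you commit:</p><blockquote style="font-family: courier;">set groups ospf_passive protocols ospf area <*> interface <*> passive<br />set protocols ospf apply-groups ospf_passive<br />set protocols ospf area 0.0.0.0 interface ge-0/0/1.0 apply-groups-except ospf_passive</blockquote><p>The first line defines the group, the second applies it under [edit protocols ospf], and the third exempts one interface so it can still form a neighbour relationship. Bear in mind the wildcards only match interfaces that are already listed under an area; the group won't add new ones for you. Each of these steps is covered in turn below. 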
 </p><p>The least effort is to apply them to the entire configuration (<span style="font-family: courier;">set apply-groups <groups name(s)></span>), but that is not efficient. Instead, add apply-groups closer to where they will be applied - people are more likely to notice groups are in use that way, too. </p><p></p><blockquote><span style="font-family: courier;">set interfaces apply-groups [list_of your_groups about_interfaces]</span></blockquote><p></p><p> From the example group we configured a moment ago, this would be</p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">set interfaces apply-groups mpls_lt</span></blockquote><p></p><p>What about if we want to <b>NOT</b> use a group on a particular bit of configuration? Easy! Set <span style="font-family: courier;">apply-groups-except</span> on the relevant bit of your configuration. </p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">set interfaces lt-0/0/0.1 apply-groups-except mpls_lt</span></blockquote><p></p><p>or more generically</p><p></p><blockquote style="font-family: courier;">set <some config item> apply-groups-except <one or more groups you want to exclude> </blockquote><p><span style="font-family: inherit;">If you want to get really fancy, you can also apply configuration conditionally, such as </span><span style="font-family: courier;">when</span><span style="font-family: inherit;"> the platform is a particular model:</span></p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">set groups mx240_config when model mx240</span></blockquote><p></p><p>would apply whatever was in the group mx240_config to any device that was an MX240 - but it would be ignored on any other Juniper platform. </p><p><br /></p><p></p><p></p><p></p><h1 style="text-align: left;">Help! 
My commit time is horrid now!</h1><p>If you set apply-groups, it may now take your router some time to process all of the stanzas within your configuration, decide whether each needs a change, and apply the change from the apply-group(s) you defined. </p><p>Remember, you can add <span style="font-family: courier;">set apply-groups [list of groups]</span> at the top of your configuration, but that would evaluate every single configuration stanza against every single group! Rather add the apply-groups to sub-sections of your configuration, such as listing only the apply-groups for interfaces under the [edit interfaces] section. <br />Fewer wildcard matches will also improve speed.</p><p>On some Juniper platforms, </p><p></p><blockquote style="font-family: courier;">set system commit persist-groups-inheritance</blockquote><p><span style="font-family: inherit;">allows the system to store the results of some of the applied groups with wildcards between commits (almost "precompiled" if you will), speeding the evaluation of the groups and therefore improving commit times. You may also like to be more specific about where you apply-groups, and which groups you apply, if you haven't already done that! </span></p><p>When we upgraded from MX10 edge routers to MX204, one thing we noticed was a massive improvement not only in BGP convergence, but also in commit times, which became near-instantaneous - slow commits can be really annoying, so it's worth spending a little time to make sure you're not doing your configuration in a way that's likely to drag them out unnecessarily. The improved commit time was mainly down to much more CPU power (a 1.33 GHz PowerPC CPU vs. a 1.6 GHz 8-core Intel x86 CPU), but a side-order of solid state storage probably didn't hurt either. 
I offer this tale to illustrate that slow commits are annoying - particularly once you've experienced near instantaneous ones...!</p><p><span style="font-family: inherit;"><br /></span></p><h1 style="text-align: left;"><span style="font-family: inherit;">Further reading</span></h1><p></p><ul style="text-align: left;"><li><span style="font-family: inherit;"><a href="https://www.juniper.net/documentation/en_US/junos/topics/topic-map/configuration-groups-usage.html">Configuration groups usage (Juniper)</a> will give you a good overview, and some ideas of how you might find them useful. </span></li></ul><p></p><p><br /></p><p></p>James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com0tag:blogger.com,1999:blog-5949954951402579361.post-31897632928621458332020-09-14T23:13:00.036+01:002020-10-02T14:13:30.525+01:00Juniper Home Lab - virtual lab topology on a single physical device<p>One of the things I've wanted for a long time is a few Juniper devices lying around my home to keep my Juniper CLI skills up to scratch and to experiment with new concepts as I learn them. Sure, you can run great labs in things like <a href="https://www.eve-ng.net/index.php/documentation/howtos/howto-add-juniper-vmx-16-x-17-x/" target="_blank">EVE-NG</a>, but you ultimately need licensed VM images (and a machine with a fair amount of RAM and CPU grunt for any complex topologies), and those licenses are quite expensive (although if you have a juniper.net login, you can download a free 60 day evaluation copy of <a href="https://support.juniper.net/support/downloads/?p=vmxeval#sw" target="_blank">vMX</a> router, <a href="https://www.juniper.net/us/en/dm/free-vsrx-trial/" target="_blank">vSRX</a> firewall or <a href="https://www.juniper.net/us/en/dm/free-vqfx-trial/" target="_blank">vQFX</a> switch; apparently, you can simply recycle trial licenses - not that this is recommended [see e.g. 
page 351 of the <a href="https://www.juniper.net/documentation/en_US/day-one-books/junos-beginners-guide.pdf" target="_blank">Junos beginner's Day One guide</a>]). Indeed, even if you (re)use trial licenses, you'll probably need a fairly hefty - expensive - server to run them on, which will cost a similar amount to quite a lot of second hand devices; the key advantage of the former, perhaps, is you're more likely to have a more current Junos image to work with in the virtualised space than with gear from the old second hand market.</p><p>Old second hand Juniper gear, however, is quite cheap, and although you won't get support or upgrades, it also won't cost you any ongoing support fees. I'd be very wary of downloading random Junos images off the internet - some people do seem to share them if you look hard enough. I ordered two SRX 110H2-VA routers off eBay to scratch this itch (I further scratched this itch with <a href="https://schoolsysadmin.blogspot.com/2020/09/a-home-for-home-lab-startech-12u.html">more gear</a>...). I don't need anything particularly fancy, and these units are quite cheap, fairly compact, and lack fans, so they are nice and quiet. There are a lot of basic SRX firewalls available online; as Junos is somewhat consistent across most of the platforms, you will find getting a thing marketed as a "firewall" also lets you learn most of the Juniper platform features for not only firewalling, but switching and routing too from across their portfolio - aside, of course, from those features not supported on this platform or software version. </p><p>By the end of this post, you should be able to create a single router that has 8 virtual routers configured on it with a fairly complex, but easily understood, topology. </p><p>Read on for some ideas... 
</p><span><a name='more'></a></span><p><br /></p><h1 style="text-align: left;">"Out of Box" experience</h1><p>The SRX units were factory reset by the Ebay seller (in Amnesiac mode) and already upgraded to the latest stable 12.x train available (12.3X48-D85.1) for these now somewhat obsolete units. You need a valid support contract to be able to download the Junos images for a given device, so it's worth checking that the units you buy are at least somewhat up-to-date. If you are undergoing certifications, you may need to use particular versions of Junos (and possibly particular hardware), as the features and commands can (and do) change a little between versions and platforms. </p><p>They do take a few minutes to boot up after you apply power, so don't be too impatient!</p><p>One thing I'll note is that they get remarkably warm - particularly if you stack them on top of each other. I don't recommend doing that as a result, which is a shame (the <a href="https://www.juniper.net/documentation/en_US/release-independent/junos/topics/reference/requirements/services-gateway-srx110-clearance-airflow-requirement.html" target="_blank">hardware guides</a> don't specifically say not to do so). I'll probably order a small rack or something to house them in eventually. They are apparently rated up to about 40 degrees Celsius ambient temperature (it is low 20s around here at the moment), but will probably thermally throttle or shutdown before getting anywhere near that if they bake each other at the same time. They're clearly meant for desktop use as a single unit (or as a single unit in a roomy, not particularly full rack, if you have a rack mount kit). As a lab device and not a production device, turning them off is probably a good idea - they'll last longer if kept cooler, and they will save you power, so long as you also unplug the power supplies. 
If you don't have a small rack, a very large desk or separate table to use as a workspace is quite useful to construct your topologies.</p><p>You're <a href="https://www.juniper.net/documentation/en_US/release-independent/junos/topics/task/operational/services-gateway-srx110-grounding.html" target="_blank">supposed to ground the devices</a>, although if you're not using ADSL, it probably won't be the end of the world if you don't; if you're installing them for production use, however, follow the manufacturer's instructions - or your site-specific requirements. </p><p>Although you can achieve a lot through the web GUI or an SSH connection to the device, I recommend having a "<a href="https://en.wikipedia.org/wiki/Rollover_cable" target="_blank">yost</a>" or rollover serial cable for management purposes. You can get USB ones cheaply now. </p><p><br /></p><h1 style="text-align: left;">Junos crash course</h1><p>I've been using Juniper's Junos since <a href="https://schoolsysadmin.blogspot.com/2016/06/juniper-ex4600s-first-impressions.html" target="_blank">2016</a>, so I'm reasonably familiar with it - although if you're not, you will find Juniper has spent a lot of time and effort putting together learning resources like the <a href="https://www.juniper.net/documentation/" target="_blank">tech library</a>; the <a href="https://www.juniper.net/documentation/jnbooks/en_US/day-one-books/" target="_blank"><i>Day One</i> books on various topics</a> are particularly good, and available as free PDFs, as well as for a nominal cost as Kindle e-books - or as print-on-demand hardcopy through <a href="https://store.vervante.com/c/v/category_order.html?base_cat=Juniper%20Networks%3aShop%20Day%20One%20Books&pard=juniper&id=oKakGIXx" target="_blank">Vervante</a> (or their presence on the likes of Amazon). The <i>Day One</i> guides are typically around 100 pages or so, allowing you to very quickly get up and running on a platform or technology. 
You should recognise that there is a <i>lot</i> more to learn (I have hundreds of pages of Juniper reading to plough through on my bookshelf), but they give you some "quick wins" and a solid introduction to the topic. </p><p>You can also make use of the extensive courses and learning materials on <a href="https://cloud.contentraven.com/junosgenius/login" target="_blank">Junos Genius</a>; until recently at least, you've been able to complete a course and get a free voucher for their five Associate level certification exams (JNCIA Junos, Sec, DevOps, Cloud & JNCDA); I think this has been popular enough that this now only nets you a 75% discount - although a free digital training course is still a nice touch. </p><p>A commonly encountered difference is between those platforms using "ELS" (Enhanced Layer 2 Software) command syntax and the older ones that don't - to me the most obvious manifestation of ELS vs. non-ELS is the name of the logical Layer 3 interface you use to put an IP address on a VLAN for ethernet-switching. If it's called "vlan.<id>", then it's the old non-ELS syntax; if it's called "irb.<id>" then it is ELS - but there are other differences. Fortunately, Junos is quite good at giving you very helpful contextual clues when you hit the ? key wherever you're not certain where to go next (not to mention the built-in help with <span style="font-family: courier;">help apropos</span> <keyword>) - for the most part, there are few surprises about where you would need to go within the configuration hierarchy, particularly after a few days or weeks of playing around in the ecosystem. 
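</p><p>As a rough illustration of that naming difference, here is the same Layer 3 VLAN interface sketched in both styles. The VLAN name <span style="font-family: courier;">lab</span>, VLAN ID 10 and the 192.0.2.1/24 address are made up for the example, and the exact statements available vary by platform and release, so check with ? on your own device. Non-ELS:</p><blockquote><span style="font-family: courier;">set vlans lab vlan-id 10<br />set vlans lab l3-interface vlan.10<br />set interfaces vlan unit 10 family inet address 192.0.2.1/24</span></blockquote><p>ELS:</p><blockquote><span style="font-family: courier;">set vlans lab vlan-id 10<br />set vlans lab l3-interface irb.10<br />set interfaces irb unit 10 family inet address 192.0.2.1/24</span></blockquote><p>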
Another big difference is between those platforms that still make use of FreeBSD as the underlying OS, and those which are now using a version of Linux as the host, and basically run Junos as a VM within that; day to day, that makes fairly little operational difference - it's mainly noticeably different when it comes to upgrading Junos versions during maintenance windows and the slightly different ways this is done between the two host OSs. Whilst Juniper have long claimed a "one Junos" (where all the commands are hypothetically uniform across the entire hardware range), this is starting to break down a little, so do be prepared for the odd difference between different Juniper hardware platforms and software versions; another, perhaps unsurprising, difference is configuring bridges and VLANs on things that are "switches" (<a href="https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/family-ethernet-switching-edit-interfaces-qfx-series.html" target="_blank">EX series</a>) vs things that are "routers" (<a href="https://kb.juniper.net/InfoCenter/index?page=content&id=KB27291&cat=MX40_1&actp=LIST" target="_blank">MX series</a>). </p><p>Having a physical device (or preferably several) as you learn things is really quite helpful. Although you can make use of the <a href="https://jlabs.juniper.net/vlabs/" target="_blank">free vLabs platforms</a>, it is often nicer to have complete freedom to configure things as you wish - and without time limits. </p><p>Download a few of those resources, or sign up to the courses that look most interesting; start with the JNCIA-Junos course for a solid grounding and go from there. I'm going to assume you're going to embark on your own learning journey here, and will explore the CLI in your own time, so I don't cover all the details below. 
</p><p>In the rest of this post, I'll suggest a few things you might want to play with in your lab.</p><p>See also: </p><p></p><ul style="text-align: left;"><li><a href="https://www.juniper.net/documentation/en_US/day-one-books/junos-beginners-guide.pdf" target="_blank">Day One: Beginner's Guide to Learning Junos</a></li><li><a href="https://www.juniper.net/documentation/en_US/day-one-books/ExploreJunosCLI_2ndEd.pdf" target="_blank">Day One: Exploring the Junos CLI 2nd Edition</a></li><li>Those very familiar with Cisco's platforms will find <a href="https://www.juniper.net/documentation/en_US/day-one-books/JunosForIOSEngineers.zip" target="_blank">Junos for IOS Engineers</a> quite useful!</li><li><a href="https://www.juniper.net/documentation/product/en_US/srx110" target="_blank">SRX110 documentation</a> (there is a LOT of info here). </li></ul><div><br /></div><p></p><h1 style="text-align: left;">Flow and Packet Mode</h1><p>The SRX platform is intended to be a stateful firewall. Out of the box, this makes using them as routers a little harder than it needs to be. The SRX110 ships with a configuration that makes the first port (fe-0/0/0) and the ADSL port "WAN ports" with a DHCP client, and members of the "untrust" firewall zone. The rest of the ports are bridged together in a VLAN in the "trust" zone, with a DHCP server (this VLAN can also be used to manage the device), NAT set up out of the WAN port, and a basic permissive outgoing firewall ruleset. This makes it work like a lot of consumer "routers" - plug the WAN port into something with a working Internet connection, and connect your PC to one of the "LAN" ports, and it will "just work" as a stateful firewall (although if you're using the ADSL port or <a href="https://www.juniper.net/documentation/en_US/release-independent/junos/topics/concept/3g-usb-modem-srx110-overview.html" target="_blank">3G USB modem</a> port, you'll need to do some config to get them to work). 
</p><p>If you're trying to learn routing on Junos, get the firewall features out of the way (obviously in production you keep firewalls<i> firmly </i>in the way!). The stateful firewall mode is known as "flow" mode (because, of course, a stateful firewall keeps track of connections - <i>flows</i> - of traffic); the alternative is known as "packet" mode, as it treats each packet separately, as a typical router would. </p><p>An SRX in flow mode looks like this when you issue the <span style="font-family: courier;">show security flow status</span> command:</p><p><span style="font-family: courier;"></span></p><blockquote><p><span style="font-family: courier;">root> show security flow status</span></p><p><span style="font-family: courier;"> Flow forwarding mode:<br /></span><span style="font-family: courier;"> Inet forwarding mode: flow based<br /></span><span style="font-family: courier;"> Inet6 forwarding mode: drop<br /></span><span style="font-family: courier;"> MPLS forwarding mode: drop<br /></span><span style="font-family: courier;"> ISO forwarding mode: drop<br /></span><span style="font-family: courier;"> Flow trace status<br /></span><span style="font-family: courier;"> Flow tracing status: off<br /></span><span style="font-family: courier;"> Flow session distribution<br /></span><span style="font-family: courier;"> Distribution mode: RR-based<br /></span><span style="font-family: courier;"> Flow ipsec performance acceleration: off<br /></span><span style="font-family: courier;"> Flow packet ordering<br /></span><span style="font-family: courier;"> Ordering mode: Hardware</span></p></blockquote><p><span style="font-family: courier;"></span></p><p>You may find things like <span style="font-family: courier;">MPLS forwarding mode: drop</span> cramp your style!</p><p>Putting most SRX devices into packet mode is a question of two commands in configure mode:</p><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><p style="text-align: 
left;"><span style="font-family: courier;">delete security</span></p></blockquote><p>to remove the flow security filters, and</p><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><p style="text-align: left;"><span style="font-family: courier;">set security forwarding-options family mpls mode packet-based</span></p></blockquote><p>to set up packet mode and then, of course, a <span style="font-family: courier;">commit</span> and a <span style="font-family: courier;">request system reboot</span>. </p><p>If you issue the <span style="font-family: courier;">show security flow status</span> command after doing that - but before rebooting - you'll notice the output changes a bit: </p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">root@srx2> show security flow status<br /> Flow forwarding mode:<br /> Inet forwarding mode: flow based (reboot needed to change to packet based)<br /> Inet6 forwarding mode: drop<br /> MPLS forwarding mode: drop (reboot needed to change to packet based)<br /> ISO forwarding mode: drop<br /> Flow trace status<br /> Flow tracing status: off<br /> Flow session distribution<br /> Distribution mode: RR-based<br /> Flow ipsec performance acceleration: off<br /> Flow packet ordering<br /> Ordering mode: Hardware</span></blockquote><p></p><p>(I also set the hostname, hence the @srx2 in the prompt).</p><p>After rebooting, you'll have: </p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">root@srx2> show security flow status<br /> Flow forwarding mode:<br /> Inet forwarding mode: packet based<br /> Inet6 forwarding mode: drop<br /> MPLS forwarding mode: packet based<br /> ISO forwarding mode: drop<br /> Flow trace status<br /> Flow tracing status: off<br /> Flow session distribution<br /> Distribution mode: RR-based<br /> Flow ipsec performance acceleration: off<br /> Flow packet ordering<br /> Ordering mode: 
Hardware</span></blockquote><p>In particular, you will find it much easier to work in packet mode if you're learning features that require packet mode handling of packets to work; in flow mode, you may keep hitting barriers to learning that are not because you misconfigured something, but rather because you haven't managed to get the particular packets you need to be processed in packet mode instead of in flow mode. </p><p>Whilst you can make use of "selective stateless packet based services" interface packet filters to force packets that need per-packet handling into packet mode, you may be better served learning the underlying technology first (in packet mode), and knowing that you can make it work without complicating it with a bunch of special exceptions for traffic that needs "special" handling (in flow mode). In other words, learn the routing protocols you want to learn, learn Juniper's stateful SRX firewalling, and then stateless interface packet filters - and only then combine them once you're confident with all of them, instead of wondering whether you've messed up something with the protocol - or just not got the filters right. Start simple,<i> then </i>complicate! </p><p>If you later want to learn the firewall features AND how to make them work with packet mode protocols, you can always just revert to factory settings and go from there - or change the configuration to remove the packet mode configuration and put back stateful flow mode rules. You'll probably also have to add the host-inbound-traffic configuration for the protocols you want to work with, within each relevant zone. 
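</p><p>As a minimal sketch of that last point, assuming the factory-default zone name <span style="font-family: courier;">trust</span> and using OSPF purely as an example protocol, allowing a routing protocol to reach the SRX itself in flow mode looks something like:</p><blockquote><span style="font-family: courier;">set security zones security-zone trust host-inbound-traffic protocols ospf</span></blockquote><p>You can also set this per interface within a zone rather than zone-wide; use ? at that point in the hierarchy to see which protocols and system services your Junos version offers. 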
</p><p>see also: </p><p></p><ul style="text-align: left;"><li><a href="https://kb.juniper.net/InfoCenter/index?page=content&id=KB30461&cat=JUNOSV_FIREFLY&actp=LIST" target="_blank">[SRX] How to change forwarding mode for IPv4 from 'flow based' to 'packet based'</a></li><li><a href="https://www.juniper.net/documentation/en_US/junos/topics/topic-map/security-packet-based-forwarding.html" target="_blank">Packet-Based Forwarding</a></li></ul><div><br /></div><p></p><h1 style="text-align: left;">How many should I buy, and which ones?</h1><p>SRX devices are common in home labs for people that work with Juniper gear - they seem to be widely available and quite cheap second hand. You want at least two, although three or more is probably better, because you can construct more complex topologies - but you can get away with just one if you're happy to deal with the initially mind-bending nature of a virtualised topology within a single router - stay tuned for how to do this a bit later in this post. Another bonus of physical units is it is pretty easy to simulate link failures - just pull the cable out; that can be more of a challenge in virtualised environments (but is pretty easy - disable the interface).</p><p>If you want to try inter-vendor compatibility, buy at least one router from another vendor, perhaps a Cisco if you want to be pretty "industry standard", or something else if you want to save some money - many Mikrotik routers can be bought new for less money than a 5 year old "enterprise" router. I learnt a lot of my early networking on Mikrotik 750 units (hEX lite is the current equivalent; hAP lite are even cheaper) - but they have their quirks. </p><p>If you want to experience more platforms in the Juniper range, then get other devices (perhaps two EX series that can do Virtual Chassis, and at least one MX router; QFX are pretty new, high end and therefore expensive). 
Ideally, make sure the platform you're buying is still somewhat supported and has a reasonably recent version of Junos installed (preferably the version currently being targeted in certifications for said platform). Unfortunately, the EX series switches are still quite expensive, and some of the basic models don't support virtual chassis (VC) - and some of the mid range ones need accessory modules to configure VC (like EX4200s need an EX-UM-2X4SFP). </p><p>If you are trying to learn the concepts, buying EoL/EoS units is not a problem. However, if you're trying to experiment and learn with specific current technologies not available in the older platforms, or latest versions of the operating system, you may find this cramps your style (particularly if they aren't available in a <a href="https://jlabs.juniper.net/vlabs/">vLab</a>) - but if you're doing that, hopefully your employer will supply you with what you need to demonstrate the concept in a work lab, because it's going to cost you a huge amount to build for yourself at home!</p><p><br /></p><h1 style="text-align: left;">A suggestion...</h1><p>Although I will give a lot of configuration below that you could just copy and paste, I <i>strongly</i> recommend that you come up with your own topology, numbering scheme and ideas - and manually type the commands in yourself. I know from my own personal experience that it is MUCH easier to learn (and remember) the command syntax that way - and you'll understand what configuration is done, and why it is done in a particular way, much better - if you do it yourself, line by line. Words in a book (or blog!) or config copy/pasted - or existing production routers you look at the config for - will <i>never</i> teach you as much as actually implementing things from scratch on your own. 
By all means start out by copy/pasting "working" configurations to get a feel for a platform or topic - but you will ultimately get more value out of truly doing it yourself - designing the topology, deciding on various architectural choices, and implementing - and testing (and even documenting) - the entire configuration yourself. </p><p>As for how you ought to proceed: arguably, whichever way your brain best "gets" it - but my suggestion is to start with a business need, plan a design to meet that need, then create all the necessary interfaces with the required addresses, assign them to routing instances as needed, then add the various protocols and filters and, of course, verify and test. </p><p>In other words, the suggested basic steps are:</p><p></p><ol style="text-align: left;"><li>identify needs</li><li>design (decide what you want to achieve; make an annotated network diagram / sketch)</li><li>create interfaces and physical (or logical) interconnections</li><li>assign IP addresses</li><li>create (virtual) routers</li><li>assign interfaces to virtual routers, if necessary</li><li>set up routing protocols or static routes, assigning to virtual routers if necessary</li><li>set up any additional configuration you want to learn about (many routing protocols need route filters, or you may need to "leak" routes between routing instances)</li><li>verification - does it work?</li><li>documentation - cement your learning by writing it all down</li></ol><div><br /></div><p></p><h1 style="text-align: left;">Virtual Routers and Interfaces</h1><p>Of course, if you only have a few routers (or just one), you might like to try creating additional virtual routers inside the one(s) you have. This also helps if you're struggling to run big topologies of virtualised routers in something like EVE-NG on a modest machine. On some of the higher end Juniper platforms, you can have literally thousands of virtual routers. 
Obviously, on more basic routers, you're not going to try carrying entire Internet routing tables many times over (or at all!) over tens or hundreds of virtual devices - a few illustrative prefixes are all you need to learn a platform and experiment. </p><p>Don't have enough physical ports? You can use sub-interfaces and VLANs or even logical tunnel interfaces. In this post, I'm going to pretend I only have a single SRX and need to create a topology between several routers - on a single device. </p><p>Note that you should be quite far along in your journey to understanding Junos before you start playing with these, as it can get quite confusing (use your network diagram!), but as you grow out of the basic lab scenarios and want to create something a bit more complicated (crazy?) and don't have a lot of gear, this is definitely a route to explore. It will help you greatly to design something on paper, and then figure out how to make it work, annotating the diagram as you go (with interface names and IP addresses - and any other information you find useful). </p><p>Obviously, there are limits to how many of these you can make on the lower end platforms (and even the high end platforms) - so you're certainly not going to simulate the entire internet or a large campus LAN or enterprise WAN on a single device. You're likely to find the number is surprisingly high; although performance might not be stellar, it will be good enough to learn on!</p><p>Before you start, draw your intended topology out - consider putting boxes grouping the physical and virtual routers belonging to each actual unit you have, so you know "where" each unit is and what it is called. Label the physical or virtual interconnections (ports, VLANs, or any other interfaces) and IP addresses between them, too. 
</p><p>Aside from the typically licensed feature of <a href="https://www.juniper.net/documentation/en_US/junos/topics/topic-map/security-logical-systems-for-routers-and-switches.html">logical-systems</a>, there are two major types of router virtualisation you'll encounter - <span style="font-family: courier;">routing-instances instance-type vrf</span> and <span style="font-family: courier;">routing-instances instance-type virtual-router </span><span style="font-family: inherit;">(there are of course <a href="https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/instance-type-edit-routing-instances-vp.html" target="_blank">a few others</a> as well as <a href="https://www.juniper.net/documentation/en_US/junos/topics/topic-map/junos-node-slicing-overview-topic-map.html">Node Slicing</a>!)</span>. In this context, you typically want virtual-router; vrf is more commonly deployed in service provider scenarios involving L3 VPNs; in my head I file them as "VRF is for service providers wanting to segregate multiple client VPN networks; virtual-router is when I want to segregate my own network - or pretend I have more routers than I really do". As well as the handy "make more routers for my lab" potential of these, they're quite useful in the real world, too - at my last job, we used virtual-router quite extensively across our core and distribution routers to segregate a campus network into different segments that only "met" on our edge routers and had to traverse the firewall (both ways, so the policies relevant to each segment applied, as necessary) to do so. Even though the segments shared physical infrastructure, they were logically segregated with routing instances. </p><p><br /></p><h2 style="text-align: left;">Creating virtual-routers</h2><p>Creating a virtual router in Junos is very easy. 
The syntax is:</p><p></p><blockquote style="font-family: courier;">set routing-instances <name> instance-type virtual-router </blockquote><p><span style="font-family: inherit;">for example, to create one called VR1: </span><span style="font-family: courier;"><br /></span></p><p></p><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><p></p><p style="text-align: left;"><span style="font-family: courier;">set routing-instances VR1 instance-type virtual-router</span></p><p></p></blockquote><p></p><p><span style="font-family: inherit;">You should get into the habit of always adding loopback interfaces and IP addresses to routers - virtual routers are no different than "real" ones in this regard! </span></p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">set interface lo0 unit 1 family inet address 10.255.2.1/32<br /></span><span style="font-family: courier;">set interface lo0 unit 2 family inet address 10.255.2.2/32</span></blockquote><p>We'll associate these interfaces with the virtual router in a later step. </p><p>Furthermore, you should get into the habit of specifying the router-id used in routing protocols as the loopback IP address. If you don't, Junos will pick one of the configured router IP addresses as the router-id, typically either the IP address of the loopback, or the lowest numbered IP on an interface if a loopback IP is not configured. This may not be what you want; operationally, it is VERY useful to have the router ID and the loopback IP match, so you can easily - and consistently - determine which router is sending what. Setting this is pretty easy. 
</p><p></p><blockquote style="font-family: courier;">set routing-instances VR1 routing-options router-id 10.255.2.1<br />set routing-instances VR2 routing-options router-id 10.255.2.2</blockquote><p><span style="font-family: inherit;">Now, any routing protocol you happen to run on those routing-instances will have the router-id match the loopback IP address. </span></p><p><span style="font-family: courier;"> </span></p><p></p><span style="font-family: courier;"></span><p></p><p></p><h2 style="text-align: left;">Creating virtual interfaces (logical tunnels)</h2><p>The next handy virtual construct for a complex virtual lab topology is virtual interfaces - <i>logical tunnels</i>. You can simply create an interface like <span style="font-family: courier;">lt-0/0/0.0</span>, assign it an IP address and assign it to one of your virtual routing-instances. Logical tunnels, somewhat like loopbacks, never really go down, so they will still keep on going even if you unplug ethernet cables. Pretty handy!</p><p>There are a few minor complications with this. Firstly, you should think of them as point-to-point interfaces - indeed, you need to specify which <span style="font-family: courier;">lt</span> interfaces are "connected" with each other; you therefore need to configure them in pairs. </p><p>I like to keep some kind of consistency in naming and numbering so I don't have to think too hard about which bits belong with which other bits, so I keep things like virtual-router names and interface units consistent - so a virtual router called VR1 will use Unit 1 not only for a lo0 interface, but also for its lt interfaces; VR2 will use Unit 2 - and so on. Eventually, this can break down somewhat (if you've already used Unit 1 and Unit 2 to link router 1 and router 2 together, what do you use to link router 2 to router 3?). You can of course contrive more complex numbering schemes (like unit 12 links router 1 to router 2; unit 21 links router 2 to router 1 - and so on). 
Unit numbers are 14 bits long, so you <a href="https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/unit-edit-interfaces-sv-interfaces.html" target="_blank">can have over 16,000 of them</a>! If you've not already come across the concept, you ALWAYS need to configure a logical sub-interface on a physical interface in Junos before you can use it - most commonly, you'll use unit 0, but you don't have to label sequentially, and you don't always have to assign unit 0 (as long as you assign at least one unit subinterface of some numeric value; in some cases, you are restricted to unit 0, but Junos will let you know if you're trying to do something it doesn't like during <span style="font-family: courier;">commit</span> or <span style="font-family: courier;">commit check</span>). <br />If you're applying IP addresses to VLAN units, it is strongly recommended that you match the unit number to the VLAN ID, although it is not mandatory. <br />If you've not noticed, you'll often see the unit number appended to the interface name - so lt-0/0/0.0 is unit 0 on lt-0/0/0. 
<br />You can save yourself a tiny bit of typing by specifying the interface with the unit number - so </p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">set interface lt-0/0/0 <span style="background-color: #fcff01;">unit 0</span> </span><span style="font-family: courier;">family inet address 10.0.0.0/30</span></blockquote>and <br /><span style="font-family: courier;"><blockquote>set interface lt-0/0/0<span style="background-color: #fcff01;">.0</span> family inet address 10.0.0.0/30 </blockquote></span><p></p><p>are both valid <span style="font-family: courier;">set</span> commands for the same logical sub-interface; the former is the more commonly used command syntax - basically, note that typing <span style="font-family: courier;">.<unit number></span> after the physical interface is equivalent to <span style="font-family: courier;">unit <unit number></span><span style="font-family: inherit;">.</span></p><p>Labeling things (adding descriptions) can also be very useful, for example:</p><p><span style="font-family: courier;"></span></p><span style="font-family: courier;"><blockquote>set interfaces lt-0/0/0 unit 1 description VR1-VR2 <br />set interfaces lt-0/0/0.2 description "VR2 to VR1"</blockquote></span><p></p><p>If you want to include spaces in your text, make sure you quote the text - it can be a good habit to get into to always add quotes to your descriptions, just in case. </p><p>In order to create the logical tunnel, first set up the VR1 side. Note the <span style="font-family: courier;">peer-unit</span> is the unit number I want as the "other end" of the connection. In this instance, I want to assign Unit 1 to Virtual Router 1, and Unit 2 to Virtual Router 2 - and use these two interfaces to link between these two virtual routers. 
</p><blockquote><span style="font-family: courier;">set interfaces lt-0/0/0 unit 1 encapsulation vlan<br />set interfaces lt-0/0/0 unit 1 vlan-id 1<br />set interfaces lt-0/0/0 unit 1 peer-unit 2<br />set interfaces lt-0/0/0 unit 1 family inet address 10.10.1.1/30</span></blockquote><div>You can also do this as one very long set command, but <i>the order of the elements matters</i>; some attempts at this will be syntactically invalid. If you do them in the same order as shown above (interface unit, encapsulation type, vlan-id, peer-unit, IP addressing) it works: </div><div><span style="font-family: courier;"><blockquote>set interfaces lt-0/0/0.1 encapsulation vlan vlan-id 1 peer-unit 2 family inet address 10.10.1.1/30 </blockquote></span></div>Next, set up the other side of the connection:<p></p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">set interfaces lt-0/0/0 unit 2 encapsulation vlan<br />set interfaces lt-0/0/0 unit 2 vlan-id 1<br />set interfaces lt-0/0/0 unit 2 peer-unit 1<br />set interfaces lt-0/0/0 unit 2 family inet address 10.10.1.2/30</span></blockquote><p>(You could use the "one liner" syntax here, too. If you're struggling to work out the right order, pipe the output of a show command for an already configured grouping of stanzas to <span style="font-family: courier;">| display set</span> to see a working order for a potential one liner - e.g. <span style="font-family: courier;">show configuration interfaces lt-0/0/0.1 | display set</span> and follow that order). </p><p></p><p>Note that you <b>must</b> set an encapsulation type on <span style="font-family: courier;">lt</span> interfaces - <span style="font-family: courier;">vlan</span> should meet most needs for this sort of labbing; <span style="font-family: courier;">ethernet</span> may work just as well. 
You may want to use different VLAN IDs to keep things separate between router instances - but of course the VLAN IDs on both sides of a connection need to match! Use one unique vlan-id per pair of interfaces.</p><p>This results in the following configuration listing: </p><p><span style="font-family: courier;"></span></p><blockquote style="text-align: left;"></blockquote><div><span style="font-family: courier;"></span></div><blockquote><div><span style="font-family: courier;">root@srx2> show configuration interfaces lt-0/0/0</span></div><div><span style="font-family: courier;">unit 1 {</span></div><div><span style="font-family: courier;"> description VR1-VR2;</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 1;</span></div><div><span style="font-family: courier;"> peer-unit 2;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.1/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;">}</span></div><div><span style="font-family: courier;">unit 2 {</span></div><div><span style="font-family: courier;"> description VR2-VR1;</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 1;</span></div><div><span style="font-family: courier;"> peer-unit 1;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.2/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;">}</span></div></blockquote><div><span style="font-family: courier;"></span></div><span style="font-family: courier;"><blockquote></blockquote></span><p></p><p>You can always try using /31 subnets for point-to-point interfaces - Junos supports them in certain 
scenarios, and they can add up to a lot of saved IP addresses (or even the more exotic <a href="https://www.juniper.net/documentation/en_US/junos/topics/topic-map/protocol-family-interface-address-properties.html#id-configuring-an-unnumbered-interface">un-numbered</a> interface). Of course, in a home lab you're not typically short of RFC1918 space - but particularly if you're using scarce publicly routable IPv4 addresses, you want to be efficient!</p><p>You then need to assign your newly minted interfaces to the relevant routing-instance. Issue the following commands to create virtual routers and assign interfaces to them: </p><p><span style="font-family: courier;"></span></p><blockquote><p><span style="font-family: courier;">set routing-instances VR1 instance-type virtual-router<br />set routing-instances VR1 interface lt-0/0/0.1<br />set routing-instances VR1 interface lo0.1</span></p><p><span style="font-family: courier;">set routing-instances VR2 instance-type virtual-router<br />set routing-instances VR2 interface lt-0/0/0.2<br />set routing-instances VR2 interface lo0.2</span></p></blockquote><p><span style="font-family: courier;"></span></p><div>In the configuration listing, this will look like this: </div><p><span style="font-family: courier;"></span></p><blockquote><p><span style="font-family: courier;">root@srx2# show routing-instances</span></p><p><span style="font-family: courier;">VR1 {<br /> instance-type virtual-router;<br /> interface lt-0/0/0.1;<br /> interface lo0.1;<br />}<br />VR2 {<br /> instance-type virtual-router;<br /> interface lt-0/0/0.2;<br /> interface lo0.2;<br />}</span></p></blockquote><p><span style="font-family: courier;"></span></p><div>You now have two routing instances, VR1 and VR2; they have both a loopback interface (<span style="font-family: courier;">lo0.<something></span>) and a point-to-point virtual logical tunnel interface (<span style="font-family: courier;">lt-0/0/0.<something></span>) between them. 
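Remember that none of this takes effect until the candidate configuration is committed. A cautious sequence from configuration mode might look like the following sketch (the <span style="font-family: courier;">match</span> pattern is only an illustration; adjust it to the interfaces you care about): <blockquote><span style="font-family: courier;">commit check<br />commit<br />run show interfaces terse | match "lt-|lo0"</span></blockquote>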
Time to check if it works!</div><div><br /></div><div>First, can you ping one from the other? </div><div><br /></div><div>Below, you'll see that you can ping the other side of the connection from each virtual router; that you can't ping the loopback on the remote router (no route exists); but that the respective local loopback is accessible from within each virtual router: </div><div><br /></div><div><div><span style="font-family: courier;"></span></div></div><blockquote><div><div><span style="font-family: courier;">root@srx2> ping inet routing-instance VR1 10.10.1.2</span></div><div><span style="font-family: courier;">PING 10.10.1.2 (10.10.1.2): 56 data bytes</span></div><div><span style="font-family: courier;">64 bytes from 10.10.1.2: icmp_seq=0 ttl=64 time=2.255 ms</span></div><div><span style="font-family: courier;">^C</span></div><div><span style="font-family: courier;">--- 10.10.1.2 ping statistics ---</span></div><div><span style="font-family: courier;">1 packets transmitted, 1 packets received, 0% packet loss</span></div><div><span style="font-family: courier;">round-trip min/avg/max/stddev = 2.255/2.255/2.255/0.000 ms</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">root@srx2> ping inet routing-instance VR2 10.10.1.1</span></div><div><span style="font-family: courier;">PING 10.10.1.1 (10.10.1.1): 56 data bytes</span></div><div><span style="font-family: courier;">64 bytes from 10.10.1.1: icmp_seq=0 ttl=64 time=4.186 ms</span></div><div><span style="font-family: courier;">64 bytes from 10.10.1.1: icmp_seq=1 ttl=64 time=2.209 ms</span></div><div><span style="font-family: courier;">^C</span></div><div><span style="font-family: courier;">--- 10.10.1.1 ping statistics ---</span></div><div><span style="font-family: courier;">2 packets transmitted, 2 packets received, 0% packet loss</span></div><div><span style="font-family: courier;">round-trip min/avg/max/stddev = 2.209/3.197/4.186/0.989 
ms</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">root@srx2> ping inet routing-instance VR1 10.255.2.2</span></div><div><span style="font-family: courier;">PING 10.255.2.2 (10.255.2.2): 56 data bytes</span></div><div><span style="font-family: courier;">ping: sendto: No route to host</span></div><div><span style="font-family: courier;">ping: sendto: No route to host</span></div><div><span style="font-family: courier;">^C</span></div><div><span style="font-family: courier;">--- 10.255.2.2 ping statistics ---</span></div><div><span style="font-family: courier;">2 packets transmitted, 0 packets received, 100% packet loss</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">root@srx2> ping inet routing-instance VR2 10.255.2.1</span></div><div><span style="font-family: courier;">PING 10.255.2.1 (10.255.2.1): 56 data bytes</span></div><div><span style="font-family: courier;">ping: sendto: No route to host</span></div><div><span style="font-family: courier;">ping: sendto: No route to host</span></div><div><span style="font-family: courier;">^C</span></div><div><span style="font-family: courier;">--- 10.255.2.1 ping statistics ---</span></div><div><span style="font-family: courier;">2 packets transmitted, 0 packets received, 100% packet loss</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">root@srx2> ping inet routing-instance VR1 10.255.2.1</span></div><div><span style="font-family: courier;">PING 10.255.2.1 (10.255.2.1): 56 data bytes</span></div><div><span style="font-family: courier;">64 bytes from 10.255.2.1: icmp_seq=0 ttl=64 time=0.314 ms</span></div><div><span style="font-family: courier;">^C</span></div><div><span style="font-family: courier;">--- 10.255.2.1 ping statistics ---</span></div><div><span style="font-family: courier;">1 packets transmitted, 1 packets received, 0% 
packet loss</span></div><div><span style="font-family: courier;">round-trip min/avg/max/stddev = 0.314/0.314/0.314/0.000 ms</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">root@srx2> ping inet routing-instance VR2 10.255.2.2</span></div><div><span style="font-family: courier;">PING 10.255.2.2 (10.255.2.2): 56 data bytes</span></div><div><span style="font-family: courier;">64 bytes from 10.255.2.2: icmp_seq=0 ttl=64 time=1.292 ms</span></div><div><span style="font-family: courier;">^C</span></div><div><span style="font-family: courier;">--- 10.255.2.2 ping statistics ---</span></div><div><span style="font-family: courier;">1 packets transmitted, 1 packets received, 0% packet loss</span></div><div><span style="font-family: courier;">round-trip min/avg/max/stddev = 1.292/1.292/1.292/0.000 ms</span></div></div><div></div></blockquote><div>You can see that in Junos, you can specify which routing-instance you want to ping from quite easily. You should of course deduce you can't reach the loopback of the other router - because there is no route to it. </div><div><br /></div><div><h2 style="text-align: left;">Routing between virtual routers</h2></div><div>So how would we make routing work? We could use static routing, but that does not scale, and you need to get used to running an IGP. So, let's get OSPF working!</div><div><br /></div><div>As you want to route between the virtual routers themselves with other routers, you need to set up OSPF within each of the routing-instances. 
In Junos, you basically tell it what OSPF area you want each interface to be part of, and off it goes: </div><div><div><span style="font-family: courier;"></span></div></div><blockquote><div><div><span style="font-family: courier;">set routing-instances VR1 protocols ospf area 0.0.0.0 interface lo0.1 passive</span></div><div><span style="font-family: courier;">set routing-instances VR1 protocols ospf area 0.0.0.0 interface lt-0/0/0.1</span> </div></div></blockquote><blockquote><div><div><span style="font-family: courier;">set routing-instances VR2 protocols ospf area 0.0.0.0 interface lt-0/0/0.2</span></div><div><span style="font-family: courier;">set routing-instances VR2 protocols ospf area 0.0.0.0 interface lo0.2 passive</span></div></div></blockquote><div>The <span style="font-family: courier;">passive</span> statement on the loopback interfaces sets them into passive mode - OSPF can advertise the directly connected IP address(es), but it will not send OSPF hello messages or form an adjacency on a passive interface. It is generally best practice to set as passive any interface you're running OSPF on but over which you don't intend to form an adjacency with a neighbor [sic]. If you didn't include the <span style="font-family: courier;">routing-instances <name></span> argument in the <span style="font-family: courier;">set protocols ospf</span> command, it would make these interfaces part of the main (physical) router's OSPF calculations - which isn't quite what you want here. </div><div><br /></div><div>Can we now reach the loopback on the other end? 
</div><div><div><span style="font-family: courier;"></span></div><blockquote><div><span style="font-family: courier;">root@srx2> ping inet routing-instance VR2 10.10.1.1</span></div><div><span style="font-family: courier;">PING 10.10.1.1 (10.10.1.1): 56 data bytes</span></div><div><span style="font-family: courier;">64 bytes from 10.10.1.1: icmp_seq=0 ttl=64 time=2.113 ms</span></div><div><span style="font-family: courier;">64 bytes from 10.10.1.1: icmp_seq=1 ttl=64 time=2.343 ms</span></div><div><span style="font-family: courier;">^C</span></div><div><span style="font-family: courier;">--- 10.10.1.1 ping statistics ---</span></div><div><span style="font-family: courier;">2 packets transmitted, 2 packets received, 0% packet loss</span></div><div><span style="font-family: courier;">round-trip min/avg/max/stddev = 2.113/2.228/2.343/0.115 ms</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">root@srx2> ping inet routing-instance VR1 10.255.2.2</span></div><div><span style="font-family: courier;">PING 10.255.2.2 (10.255.2.2): 56 data bytes</span></div><div><span style="font-family: courier;">64 bytes from 10.255.2.2: icmp_seq=0 ttl=64 time=2.062 ms</span></div><div><span style="font-family: courier;">64 bytes from 10.255.2.2: icmp_seq=1 ttl=64 time=2.274 ms</span></div><div><span style="font-family: courier;">^C</span></div><div><span style="font-family: courier;">--- 10.255.2.2 ping statistics ---</span></div><div><span style="font-family: courier;">2 packets transmitted, 2 packets received, 0% packet loss</span></div><div><span style="font-family: courier;">round-trip min/avg/max/stddev = 2.062/2.168/2.274/0.106 ms</span></div></blockquote><div><span style="font-family: courier;"></span></div></div><div>Indeed we can! </div><div><br /></div><div>This means dynamic routing is working between the two virtual routers over a virtual interface. 
Fancy!</div><div><br /></div><div>We can have a look at VR1's routing table to see where it is getting that information from:</div><div><div><span style="font-family: courier;"></span></div><blockquote><div><span style="font-family: courier;">root@srx2> show route table VR1.inet.0</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">VR1.inet.0: 5 destinations, 5 routes (5 active, 0 holddown, 0 hidden)</span></div><div><span style="font-family: courier;">+ = Active Route, - = Last Active, * = Both</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">10.10.1.0/30 *[Direct/0] 01:20:47</span></div><div><span style="font-family: courier;"> > via lt-0/0/0.1</span></div><div><span style="font-family: courier;">10.10.1.1/32 *[Local/0] 01:20:47</span></div><div><span style="font-family: courier;"> Local via lt-0/0/0.1</span></div><div><span style="font-family: courier;">10.255.2.1/32 *[Direct/0] 01:47:51</span></div><div><span style="font-family: courier;"> > via lo0.1</span></div><div><span style="font-family: courier;">10.255.2.2/32 *[OSPF/10] 00:06:40, metric 1</span></div><div><span style="font-family: courier;"> > to 10.10.1.2 via lt-0/0/0.1</span></div></blockquote><div><span style="font-family: courier;"></span></div></div><div>Unsurprisingly, we can see that we've learned the VR2 router loopback IP (10.255.2.2) from OSPF as shown in the last line. Note the format of the routing table name - virtual router instance name <dot> inet <dot> 0. inet.0 is the routing table for IPv4 across Junos; each virtual router will create its own separate <name>.inet.0 table. In this case, obviously, we looked at the table VR1.inet.0. </div><div><br /></div><div>You can further check the status of OSPF in a few ways. 
</div><div>Firstly show <span style="font-family: courier;">ospf neighbor instance <instance name></span></div><div><br /></div><div><div><span style="font-family: courier;"></span></div><blockquote><div><span style="font-family: courier;">root@srx2> show ospf neighbor instance VR1</span></div><div><span style="font-family: courier;">Address Interface State ID Pri Dead</span></div><div><span style="font-family: courier;">10.10.1.2 lt-0/0/0.1 Full 10.255.2.2 128 33</span></div></blockquote><div><span style="font-family: courier;"></span></div></div><div>Typically, you want to see the neighbors [sic] in the state "Full". </div><div><br /></div><div>You can also check the OSPF link state database for a routing instance, for example: </div><div><br /></div><div><div><span style="font-family: courier;"></span></div><blockquote><div><span style="font-family: courier;">root@srx2> show ospf database instance VR1</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;"> OSPF database, Area 0.0.0.0</span></div><div><span style="font-family: courier;"> Type ID Adv Rtr Seq Age Opt Cksum Len</span></div><div><span style="font-family: courier;">Router *10.255.2.1 10.255.2.1 0x80000005 977 0x22 0x7852 48</span></div><div><span style="font-family: courier;">Router 10.255.2.2 10.255.2.2 0x80000006 978 0x22 0x344c 60</span></div><div><span style="font-family: courier;">Router 10.255.2.3 10.255.2.3 0x80000005 979 0x22 0x6b54 48</span></div><div><span style="font-family: courier;">Network 10.10.1.2 10.255.2.2 0x80000002 983 0x22 0xfffb 32</span></div><div><span style="font-family: courier;">Network 10.10.1.6 10.255.2.3 0x80000002 979 0x22 0xe513 32</span></div></blockquote><div><span style="font-family: courier;"></span></div></div><div>More usefully: </div><div><br /></div><div><div><span style="font-family: courier;"></span></div><blockquote><div><span style="font-family: courier;">root@srx2> show ospf route instance 
VR1</span></div><div><span style="font-family: courier;">Topology default Route Table:</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">Prefix Path Route NH Metric NextHop Nexthop</span></div><div><span style="font-family: courier;"> Type Type Type Interface Address/LSP</span></div><div><span style="font-family: courier;">10.255.2.2 Intra Router IP 1 lt-0/0/0.1 10.10.1.2</span></div><div><span style="font-family: courier;">10.255.2.3 Intra Router IP 2 lt-0/0/0.1 10.10.1.2</span></div><div><span style="font-family: courier;">10.10.1.0/30 Intra Network IP 1 lt-0/0/0.1</span></div><div><span style="font-family: courier;">10.10.1.4/30 Intra Network IP 2 lt-0/0/0.1 10.10.1.2</span></div><div><span style="font-family: courier;">10.255.2.1/32 Intra Network IP 0 lo0.1</span></div><div><span style="font-family: courier;">10.255.2.2/32 Intra Network IP 1 lt-0/0/0.1 10.10.1.2</span></div><div><span style="font-family: courier;">10.255.2.3/32 Intra Network IP 2 lt-0/0/0.1 10.10.1.2</span></div></blockquote><div><span style="font-family: courier;"></span></div></div><div>Basically any of the OSPF operational commands can be made specific to the relevant routing-instance by appending <span style="font-family: courier;">instance <instance name></span> to the end of the command; without this, the command will be executed against the main router's OSPF instance.</div><div><br /></div><div>One thing that might surprise you (at first) is that you can't ping the virtual loopback from the non-virtual router itself!</div><div><div><span style="font-family: courier;"></span></div><blockquote><div><span style="font-family: courier;">root@srx2> ping 10.255.2.1</span></div><div><span style="font-family: courier;">PING 10.255.2.1 (10.255.2.1): 56 data bytes</span></div><div><span style="font-family: courier;">ping: sendto: No route to host</span></div></blockquote><p>Note that we haven't specified a routing instance or table - 
we've just told the router to use its own resources - in other words, use inet.0. </p><div><span style="font-family: courier;"></span></div></div><div>But, of course, that is <i>exactly</i> what we asked for - a separate, albeit virtual, router! To be able to ping the virtual routers, we need to connect the physical router up to the virtual one in some way, and ensure routing is working - more virtual tunnels, or use of sub-interfaces on a physical link that is up would allow this. An exercise for the reader! :) </div><div><br /></div><div>Here's the physical router's inet.0 table. Pretty tiny at the moment!</div><div><br /></div><div><div><span style="font-family: courier;"></span></div></div><blockquote><div><div><span style="font-family: courier;">root@srx2> show route table inet.0</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">inet.0: 2 destinations, 2 routes (2 active, 0 holddown, 0 hidden)</span></div><div><span style="font-family: courier;">+ = Active Route, - = Last Active, * = Both</span></div><div><span style="font-family: courier;"><br /></span></div><div><span style="font-family: courier;">10.255.255.2/32 *[Direct/0] 02:12:49</span></div><div><span style="font-family: courier;"> > via lo0.0</span></div><div><span style="font-family: courier;">192.168.1.1/32 *[Local/0] 08:09:28</span></div><div><span style="font-family: courier;"> Reject</span></div></div><div></div></blockquote><div><br /></div><div>If you want to create entirely "virtual routers" with separate management credentials and so on, take a look at <a href="https://www.juniper.net/documentation/en_US/junos/topics/topic-map/logical-systems-overview.html" target="_blank">logical systems</a>; this is obviously more onerous for the hardware, and is a licensed feature on many platforms - you might do this if, for example, you had to hand over management of part of the system to another department. 
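<br /><br />Coming back to the exercise of connecting the physical router to a virtual one: one possible approach is yet another logical tunnel pair, with OSPF enabled on both ends. This is only a sketch under some assumptions - the unit numbers (100 and 101), the VLAN ID and the 10.10.100.0/30 addressing are made up for illustration and need to be unused in your own config, and on an SRX running in flow mode you may also need to place the interfaces in security zones that permit OSPF and ping. Unit 100 is deliberately not assigned to any routing-instance, so it stays in the main router's inet.0: <blockquote><span style="font-family: courier;">set interfaces lt-0/0/0 unit 100 description "Main to VR1"<br />set interfaces lt-0/0/0 unit 100 encapsulation vlan<br />set interfaces lt-0/0/0 unit 100 vlan-id 100<br />set interfaces lt-0/0/0 unit 100 peer-unit 101<br />set interfaces lt-0/0/0 unit 100 family inet address 10.10.100.1/30<br />set interfaces lt-0/0/0 unit 101 description "VR1 to Main"<br />set interfaces lt-0/0/0 unit 101 encapsulation vlan<br />set interfaces lt-0/0/0 unit 101 vlan-id 100<br />set interfaces lt-0/0/0 unit 101 peer-unit 100<br />set interfaces lt-0/0/0 unit 101 family inet address 10.10.100.2/30<br />set routing-instances VR1 interface lt-0/0/0.101<br />set routing-instances VR1 protocols ospf area 0.0.0.0 interface lt-0/0/0.101<br />set protocols ospf area 0.0.0.0 interface lt-0/0/0.100<br />set protocols ospf area 0.0.0.0 interface lo0.0 passive</span></blockquote>After a commit, <span style="font-family: courier;">show ospf neighbor</span> (with no instance specified) is where you would look for the adjacency from the main router's side. 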
</div><p><br /></p><p></p><h2 style="text-align: left;">Disabling "always up" logical tunnels and virtual routers</h2><p>If you are used to simply pulling out cables to simulate link failures, you may be wondering how you can cut some virtual wires. The easiest way is to simply deactivate one of the logical interfaces in the link. </p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">deactivate interfaces lt-0/0/0.2 </span></blockquote><p></p><p>will cause the link between VR1 and VR2 to go down (with all the problems that may cause your topology). You'll have to commit the change before it will take effect. To reverse this, either <span style="font-family: courier;">rollback 1</span> and commit, or </p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">activate interfaces lt-0/0/0.2</span></blockquote><p></p><p>and commit the change. </p><p>You could even go so far as to disable entire virtual routers - similarly, you would use the <span style="font-family: courier;">deactivate</span> command on <span style="font-family: courier;">routing-instances <name></span><span style="font-family: inherit;"> and commit the change, perhaps simulating failure of a whole router or site.</span></p><p><span style="font-family: inherit;"><br /></span></p><h1 style="text-align: left;">Final Topology</h1><p>I've not run through every step of this, but a possible end result could look something like this: </p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSsUpkY4gmHOOFAZiImEUEI1uB8f-egtVHiky0HjRWzZgdK9sg3VQAEkWV-dHFJ_KaUO3NbSAfpEqgVTBvFrMGc2hO9rUb6LJieDmVmiskA1CrcBUn9ET3xi7Bk0ga6-6gBTGAy_AWhJQ/s1401/SRXTopology.png" style="margin-left: auto; margin-right: auto;"><img alt="Topology illustrating 
interface numbering for 8 virtual routers" border="0" data-original-height="583" data-original-width="1401" height="260" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSsUpkY4gmHOOFAZiImEUEI1uB8f-egtVHiky0HjRWzZgdK9sg3VQAEkWV-dHFJ_KaUO3NbSAfpEqgVTBvFrMGc2hO9rUb6LJieDmVmiskA1CrcBUn9ET3xi7Bk0ga6-6gBTGAy_AWhJQ/w625-h260/SRXTopology.png" width="625" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">8 router topology - all created using virtual routers and virtual logical tunnel interfaces.<br />Diagram created with <a href="http://draw.io">draw.io</a>.</td></tr></tbody></table><p>To me, this represents:</p><p></p><ul style="text-align: left;"><li>some Customer Premises Equipment (CPE) (VR 1 & 8); </li><li>some Point-of-Presence (PoP) Provider Edge (PE) routers (VR 2 & 7); </li><li>a resilient core network (VR 3,4,5 & 6). </li></ul><p></p><p>With something like this, you can experiment extensively!</p><p>You could, for instance: </p><p></p><ul style="text-align: left;"><li>start adding or changing the routing protocol or configuration associated with it, perhaps simulating a change (or even a live migration) of IGP from OSPF to ISIS, or adding BGP;</li><li>cause yourself immense pain by removing IGP and only using static routes; </li><li>add in MPLS;</li><li>add in additional PoP and CPE sites;</li><li>multiply connect CPE device(s); </li><li>add in an MP-BGP L3VPN between the CPEs or some other provider-style VPN link transparent to layer 2 (with all the required changes to all of the routers to make it work);</li><li>check the path traffic takes through the topology;</li><li>change the path traffic takes through the topology with the relevant configuration knobs, or see what effect router/link outages have; </li><li>add or subtract available paths between routers, particularly something like LACP;</li><li>get the physical router to participate in this topology with OSPF and suitable interfaces;</li><li>add a physical interface 
into the topology on one of the routers that allows a physical device (like your own laptop) to participate in the topology - perhaps one of the CPEs, complete with a DHCP server you configure on the virtual router;</li><li>get the virtual topology to inter-operate with a physical router from another manufacturer;</li><li>allow Internet access from the virtual topology;</li><li>experiment with route filtering and policy; </li><li>change the roles of the routers, adding or removing services, as needed; </li><li>add in more routers, perhaps simulating connections to an IXP and/or one or more transit ISPs. </li></ul><p></p><p>Given a topology like this with all the point-to-point and loopbacks created, addressed, and interconnected - and a working IGP - you have a very flexible plaything that can teach you a lot. As I mentioned earlier, designing and building your own topology is a much more powerful learning aid, but sometimes, you want to see a working one and get a little more familiar with some of the concepts before you take the plunge yourself. 
I hope this helps you!</p><p><br /></p><h1 style="text-align: left;">Config listing</h1><p>If you want to just copy and paste a topology (the one in the diagram above, in fact), here you go; it should work on most Juniper devices, but it was developed and tested on a SRX110H2-VA: </p><div><span style="font-family: courier;"></span></div><blockquote><div><span style="font-family: courier;">system {</span></div><div><span style="font-family: courier;"> host-name srx2;</span></div><div><span style="font-family: courier;"> name-server {</span></div><div><span style="font-family: courier;"> 208.67.222.222;</span></div><div><span style="font-family: courier;"> 208.67.220.220;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> services {</span></div><div><span style="font-family: courier;"> ssh;</span></div><div><span style="font-family: courier;"> xnm-clear-text;</span></div><div><span style="font-family: courier;"> web-management {</span></div><div><span style="font-family: courier;"> http {</span></div><div><span style="font-family: courier;"> interface vlan.0;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> https {</span></div><div><span style="font-family: courier;"> system-generated-certificate;</span></div><div><span style="font-family: courier;"> interface vlan.0;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> dhcp {</span></div><div><span style="font-family: courier;"> pool 192.168.1.0/24 {</span></div><div><span style="font-family: courier;"> address-range low 192.168.1.2 high 192.168.1.254;</span></div><div><span style="font-family: courier;"> router {</span></div><div><span style="font-family: courier;"> 192.168.1.1;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> 
}</span></div><div><span style="font-family: courier;"> propagate-settings fe-0/0/0.0;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> syslog {</span></div><div><span style="font-family: courier;"> archive size 100k files 3;</span></div><div><span style="font-family: courier;"> user * {</span></div><div><span style="font-family: courier;"> any emergency;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> file messages {</span></div><div><span style="font-family: courier;"> any critical;</span></div><div><span style="font-family: courier;"> authorization info;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> file interactive-commands {</span></div><div><span style="font-family: courier;"> interactive-commands error;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> max-configurations-on-flash 5;</span></div><div><span style="font-family: courier;"> max-configuration-rollbacks 5;</span></div><div><span style="font-family: courier;"> license {</span></div><div><span style="font-family: courier;"> autoupdate {</span></div><div><span style="font-family: courier;"> url https://ae1.juniper.net/junos/key_retrieval;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;">}</span></div><div><span style="font-family: courier;">interfaces {</span></div><div><span style="font-family: courier;"> fe-0/0/0 {</span></div><div><span style="font-family: courier;"> unit 0 {</span></div><div><span style="font-family: courier;"> family inet;</span></div><div><span style="font-family: courier;"> }</span></div><div><span 
style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> lt-0/0/0 {</span></div><div><span style="font-family: courier;"> unit 1 {</span></div><div><span style="font-family: courier;"> description VR1-VR2;</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 1;</span></div><div><span style="font-family: courier;"> peer-unit 2;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.1/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 2 {</span></div><div><span style="font-family: courier;"> description VR2-VR1;</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 1;</span></div><div><span style="font-family: courier;"> peer-unit 1;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.2/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 23 {</span></div><div><span style="font-family: courier;"> description VR2-VR3;</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 23;</span></div><div><span style="font-family: courier;"> peer-unit 32;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.5/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 25 {</span></div><div><span 
style="font-family: courier;"> description VR2-VR5;</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 25;</span></div><div><span style="font-family: courier;"> peer-unit 52;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.9/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 32 {</span></div><div><span style="font-family: courier;"> description VR3-VR2;</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 23;</span></div><div><span style="font-family: courier;"> peer-unit 23;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.6/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 34 {</span></div><div><span style="font-family: courier;"> description "VR3 to VR4";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 34;</span></div><div><span style="font-family: courier;"> peer-unit 43;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.17/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 35 {</span></div><div><span style="font-family: courier;"> description "VR3 to VR5";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> 
vlan-id 35;</span></div><div><span style="font-family: courier;"> peer-unit 53;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.13/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 36 {</span></div><div><span style="font-family: courier;"> description "VR3 to VR6";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 36;</span></div><div><span style="font-family: courier;"> peer-unit 63;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.25/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 43 {</span></div><div><span style="font-family: courier;"> description "VR4 to VR3";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 34;</span></div><div><span style="font-family: courier;"> peer-unit 34;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.18/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 45 {</span></div><div><span style="font-family: courier;"> description "VR4 to VR5";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 45;</span></div><div><span style="font-family: courier;"> peer-unit 54;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span 
style="font-family: courier;"> address 10.10.1.30/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 46 {</span></div><div><span style="font-family: courier;"> description "VR4 to VR6";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 46;</span></div><div><span style="font-family: courier;"> peer-unit 64;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.34/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 47 {</span></div><div><span style="font-family: courier;"> description "VR4 to VR7";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 47;</span></div><div><span style="font-family: courier;"> peer-unit 74;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.37/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 52 {</span></div><div><span style="font-family: courier;"> description VR5-VR2;</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 25;</span></div><div><span style="font-family: courier;"> peer-unit 25;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.10/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> 
}</span></div><div><span style="font-family: courier;"> unit 53 {</span></div><div><span style="font-family: courier;"> description "VR5 to VR3";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 35;</span></div><div><span style="font-family: courier;"> peer-unit 35;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.14/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 54 {</span></div><div><span style="font-family: courier;"> description "VR5 to VR4";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 45;</span></div><div><span style="font-family: courier;"> peer-unit 45;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.29/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 56 {</span></div><div><span style="font-family: courier;"> description "VR5 to VR6";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 56;</span></div><div><span style="font-family: courier;"> peer-unit 65;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.21/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 63 {</span></div><div><span style="font-family: courier;"> description "VR6 to VR3";</span></div><div><span 
style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 36;</span></div><div><span style="font-family: courier;"> peer-unit 36;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.26/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 64 {</span></div><div><span style="font-family: courier;"> description "VR6 to VR4";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 46;</span></div><div><span style="font-family: courier;"> peer-unit 46;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.33/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 65 {</span></div><div><span style="font-family: courier;"> description "VR6 to VR5";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 56;</span></div><div><span style="font-family: courier;"> peer-unit 56;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.22/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 67 {</span></div><div><span style="font-family: courier;"> description "VR6 to VR7";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 67;</span></div><div><span style="font-family: courier;"> 
peer-unit 76;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.41/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 74 {</span></div><div><span style="font-family: courier;"> description "VR7 to VR4";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 47;</span></div><div><span style="font-family: courier;"> peer-unit 47;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.38/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 76 {</span></div><div><span style="font-family: courier;"> description "VR7 to VR6";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 67;</span></div><div><span style="font-family: courier;"> peer-unit 67;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.42/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 78 {</span></div><div><span style="font-family: courier;"> description "VR7 to VR8";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 78;</span></div><div><span style="font-family: courier;"> peer-unit 87;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 
10.10.1.45/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 87 {</span></div><div><span style="font-family: courier;"> description "VR8 to VR7";</span></div><div><span style="font-family: courier;"> encapsulation vlan;</span></div><div><span style="font-family: courier;"> vlan-id 78;</span></div><div><span style="font-family: courier;"> peer-unit 78;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.10.1.46/30;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> fe-0/0/1 {</span></div><div><span style="font-family: courier;"> unit 0 {</span></div><div><span style="font-family: courier;"> family ethernet-switching {</span></div><div><span style="font-family: courier;"> vlan {</span></div><div><span style="font-family: courier;"> members vlan-trust;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> fe-0/0/2 {</span></div><div><span style="font-family: courier;"> unit 0 {</span></div><div><span style="font-family: courier;"> family ethernet-switching {</span></div><div><span style="font-family: courier;"> vlan {</span></div><div><span style="font-family: courier;"> members vlan-trust;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> fe-0/0/3 
{</span></div><div><span style="font-family: courier;"> unit 0 {</span></div><div><span style="font-family: courier;"> family ethernet-switching {</span></div><div><span style="font-family: courier;"> vlan {</span></div><div><span style="font-family: courier;"> members vlan-trust;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> fe-0/0/4 {</span></div><div><span style="font-family: courier;"> unit 0 {</span></div><div><span style="font-family: courier;"> family ethernet-switching {</span></div><div><span style="font-family: courier;"> vlan {</span></div><div><span style="font-family: courier;"> members vlan-trust;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> fe-0/0/5 {</span></div><div><span style="font-family: courier;"> unit 0 {</span></div><div><span style="font-family: courier;"> family ethernet-switching {</span></div><div><span style="font-family: courier;"> vlan {</span></div><div><span style="font-family: courier;"> members vlan-trust;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> fe-0/0/6 {</span></div><div><span style="font-family: courier;"> unit 0 {</span></div><div><span style="font-family: courier;"> family ethernet-switching {</span></div><div><span style="font-family: courier;"> vlan {</span></div><div><span style="font-family: courier;"> members 
vlan-trust;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> fe-0/0/7 {</span></div><div><span style="font-family: courier;"> vlan-tagging;</span></div><div><span style="font-family: courier;"> unit 99 {</span></div><div><span style="font-family: courier;"> description "client interface for VR1";</span></div><div><span style="font-family: courier;"> vlan-id 99;</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 192.168.99.1/24;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> pt-1/0/0 {</span></div><div><span style="font-family: courier;"> unit 0 {</span></div><div><span style="font-family: courier;"> family inet;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> lo0 {</span></div><div><span style="font-family: courier;"> unit 0 {</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.255.255.2/32;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 1 {</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.255.2.1/32;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 2 
{</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.255.2.2/32;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 3 {</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.255.2.3/32;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 4 {</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.255.2.4/32;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 5 {</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.255.2.5/32;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 6 {</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.255.2.6/32;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 7 {</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 10.255.2.7/32;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 8 {</span></div><div><span style="font-family: courier;"> family inet 
{</span></div><div><span style="font-family: courier;"> address 10.255.2.8/32;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> vlan {</span></div><div><span style="font-family: courier;"> unit 0 {</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 192.168.0.1/24;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> unit 1 {</span></div><div><span style="font-family: courier;"> description "VLAN 1 on VR1 - client subnet";</span></div><div><span style="font-family: courier;"> family inet {</span></div><div><span style="font-family: courier;"> address 192.168.1.1/24;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;">}</span></div><div><span style="font-family: courier;">protocols {</span></div><div><span style="font-family: courier;"> rstp;</span></div><div><span style="font-family: courier;">}</span></div><div><span style="font-family: courier;">security {</span></div><div><span style="font-family: courier;"> forwarding-options {</span></div><div><span style="font-family: courier;"> family {</span></div><div><span style="font-family: courier;"> mpls {</span></div><div><span style="font-family: courier;"> mode packet-based;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;">}</span></div><div><span style="font-family: courier;">routing-instances {</span></div><div><span 
style="font-family: courier;"> VR1 {</span></div><div><span style="font-family: courier;"> description CPE1;</span></div><div><span style="font-family: courier;"> instance-type virtual-router;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.1;</span></div><div><span style="font-family: courier;"> interface fe-0/0/7.99;</span></div><div><span style="font-family: courier;"> interface lo0.1;</span></div><div><span style="font-family: courier;"> routing-options {</span></div><div><span style="font-family: courier;"> router-id 10.255.2.1;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> protocols {</span></div><div><span style="font-family: courier;"> ospf {</span></div><div><span style="font-family: courier;"> area 0.0.0.0 {</span></div><div><span style="font-family: courier;"> interface lo0.1 {</span></div><div><span style="font-family: courier;"> passive;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.1;</span></div><div><span style="font-family: courier;"> interface fe-0/0/7.99 {</span></div><div><span style="font-family: courier;"> passive;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> VR2 {</span></div><div><span style="font-family: courier;"> description PE1;</span></div><div><span style="font-family: courier;"> instance-type virtual-router;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.2;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.23;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.25;</span></div><div><span style="font-family: 
courier;"> interface lo0.2;</span></div><div><span style="font-family: courier;"> routing-options {</span></div><div><span style="font-family: courier;"> router-id 10.255.2.2;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> protocols {</span></div><div><span style="font-family: courier;"> ospf {</span></div><div><span style="font-family: courier;"> area 0.0.0.0 {</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.2;</span></div><div><span style="font-family: courier;"> interface lo0.2 {</span></div><div><span style="font-family: courier;"> passive;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.23;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.25;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> VR3 {</span></div><div><span style="font-family: courier;"> description P1;</span></div><div><span style="font-family: courier;"> instance-type virtual-router;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.32;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.34;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.35;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.36;</span></div><div><span style="font-family: courier;"> interface lo0.3;</span></div><div><span style="font-family: courier;"> routing-options {</span></div><div><span style="font-family: courier;"> router-id 10.255.2.3;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> protocols {</span></div><div><span style="font-family: courier;"> ospf 
{</span></div><div><span style="font-family: courier;"> area 0.0.0.0 {</span></div><div><span style="font-family: courier;"> interface lo0.3 {</span></div><div><span style="font-family: courier;"> passive;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.32;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.35;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.34;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.36;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> VR4 {</span></div><div><span style="font-family: courier;"> description P2;</span></div><div><span style="font-family: courier;"> instance-type virtual-router;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.43;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.45;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.46;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.47;</span></div><div><span style="font-family: courier;"> interface lo0.4;</span></div><div><span style="font-family: courier;"> routing-options {</span></div><div><span style="font-family: courier;"> router-id 10.255.2.4;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> protocols {</span></div><div><span style="font-family: courier;"> ospf {</span></div><div><span style="font-family: courier;"> area 0.0.0.0 {</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.43;</span></div><div><span style="font-family: courier;"> interface lo0.4 {</span></div><div><span style="font-family: courier;"> 
passive;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.45;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.46;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.47;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> VR5 {</span></div><div><span style="font-family: courier;"> description P3;</span></div><div><span style="font-family: courier;"> instance-type virtual-router;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.52;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.53;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.54;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.56;</span></div><div><span style="font-family: courier;"> interface lo0.5;</span></div><div><span style="font-family: courier;"> routing-options {</span></div><div><span style="font-family: courier;"> router-id 10.255.2.5;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> protocols {</span></div><div><span style="font-family: courier;"> ospf {</span></div><div><span style="font-family: courier;"> area 0.0.0.0 {</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.52;</span></div><div><span style="font-family: courier;"> interface lo0.5 {</span></div><div><span style="font-family: courier;"> passive;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.53;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.54;</span></div><div><span style="font-family: courier;"> interface 
lt-0/0/0.56;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> VR6 {</span></div><div><span style="font-family: courier;"> description P4;</span></div><div><span style="font-family: courier;"> instance-type virtual-router;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.63;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.64;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.65;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.67;</span></div><div><span style="font-family: courier;"> interface lo0.6;</span></div><div><span style="font-family: courier;"> routing-options {</span></div><div><span style="font-family: courier;"> router-id 10.255.2.6;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> protocols {</span></div><div><span style="font-family: courier;"> ospf {</span></div><div><span style="font-family: courier;"> area 0.0.0.0 {</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.63;</span></div><div><span style="font-family: courier;"> interface lo0.6 {</span></div><div><span style="font-family: courier;"> passive;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.65;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.64;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.67;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: 
courier;"> VR7 {</span></div><div><span style="font-family: courier;"> description PE2;</span></div><div><span style="font-family: courier;"> instance-type virtual-router;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.74;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.76;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.78;</span></div><div><span style="font-family: courier;"> interface lo0.7;</span></div><div><span style="font-family: courier;"> routing-options {</span></div><div><span style="font-family: courier;"> router-id 10.255.2.7;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> protocols {</span></div><div><span style="font-family: courier;"> ospf {</span></div><div><span style="font-family: courier;"> area 0.0.0.0 {</span></div><div><span style="font-family: courier;"> interface lo0.7 {</span></div><div><span style="font-family: courier;"> passive;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.74;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.76;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.78;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> VR8 {</span></div><div><span style="font-family: courier;"> description CPE2;</span></div><div><span style="font-family: courier;"> instance-type virtual-router;</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.87;</span></div><div><span style="font-family: courier;"> interface lo0.8;</span></div><div><span style="font-family: courier;"> routing-options {</span></div><div><span style="font-family: 
courier;"> router-id 10.255.2.8;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> protocols {</span></div><div><span style="font-family: courier;"> ospf {</span></div><div><span style="font-family: courier;"> area 0.0.0.0 {</span></div><div><span style="font-family: courier;"> interface lo0.8 {</span></div><div><span style="font-family: courier;"> passive;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> interface lt-0/0/0.87;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;">}</span></div><div><span style="font-family: courier;">vlans {</span></div><div><span style="font-family: courier;"> vlan-trust {</span></div><div><span style="font-family: courier;"> vlan-id 3;</span></div><div><span style="font-family: courier;"> l3-interface vlan.0;</span></div><div><span style="font-family: courier;"> }</span></div><div><span style="font-family: courier;">}</span></div><div></div></blockquote><div>You may want to copy/paste through a text editor to get rid of any extraneous formatting it may pick up from being HTML-ised!</div><p>You may find <a href="https://kb.juniper.net/InfoCenter/index?page=content&id=KB10817&cat=SRX_5600_1&actp=LIST" target="_blank">this article</a> useful. </p><p>Note that no users or root authentication have been configured above - you should do this. 
</p><h1 style="text-align: left;">Holistic IT education / learning: secure all the things</h1><div><i>James Stapley - published 28/08/2020.</i></div><div><br /></div><div>In July, I witnessed a very interesting PoC in a talk, sketched out against a particular vendor's routers based on the "best practice" router hardening firewall configuration example given in a well-recognised, highly regarded book. </div><div><br /></div><div>This led me to think about the need for more thorough consideration of IT security throughout careers, and in particular, the danger of blindly relying on other people's information. </div><div><br /></div><div>I've embargoed this post until now, because it contains a low-content description of a potential vulnerability, and I can see that steps that should address it have been taken - hence the publication date; this was written shortly after that talk spurred some thoughts...</div><span><a name='more'></a></span><div><br /></div><div><br /></div><div>I'm a big believer in two things: education/learning, and knowing about a lot of different things. I tend to pick up on tangentially relevant information all the time. However, I think this needs to be much more rigorously pursued in a certain context by all IT professionals - security and its implications. The other aspect of this is rigorous <i>application</i> of that knowledge!</div><div><br /></div><div>Most IT people have at least some grasp of security. They're rarely going to just run code or script snippets they find online (<a href="https://blog.zsec.uk/cve-2020-1350-research/">or do they?</a>) - but they may be less wary depending on the source, particularly if it appears in published books by well-established networking luminaries from top tier publishers. They're probably doing most, or all, of the other best practice security stuff, too. They know there are still probably gaps. 
</div><div>But are there times we ourselves make things worse through action rather than inaction? </div><div>Have we put something in place that is <i>worse</i> than the default state?</div><div><br /></div><div>Note: I'll mainly be talking about "traffic" or "packets", but this applies equally to consider how you handle any "authentication", "data", "trust" or "input" and IT processes/services of any kind! </div><div><br /></div><h1 style="text-align: left;">Beware the "trusted recipe"</h1><div>How wary are you of "trusted recipes"? </div><div><br /></div><div>It appears quite a lot of people aren't wary of a trusted recipe, because the speaker found this compromise worked against several networks of significant scale, all of which probably have very experienced network "architects" and "engineers", most of whom probably think or can demonstrate that they have significant security experience. But <a href="https://www.oreilly.com/library/view/juniper-mx-series/9781491932711/ch04.html">this</a> got them, nonetheless. Likely, they assumed the authors had considered (and perhaps even tested) the impacts of the "best practice" configuration they were suggesting, or that it had the manufacturer's tacit blessing. </div><div><br /></div><div>Unfortunately, the oversight in this "trusted recipe" leads to the ability to directly connect to core router management planes from internet addresses - across the internet (with a few underlying assumptions, which commonly exist). This only happens if you implement a particular part of this best practice hardening script, or come up with the same idea independently to "harden" your router whilst allowing certain justifiable actions to happen. Oops. 
It's not an instant compromise or RCE, but it is problematic, and exploitable for nefarious subsequent use in a variety of underhanded ways.</div><div><br /></div><div>As soon as someone context-shifted my brain from "achieve an end goal" to "consider the security implications", I had an immediate light bulb moment that was precisely congruent with the rest of the talk based on the relevant config snippet - this line was a disaster. It is blindingly obvious once you pause to think about it (as so many transformative experiences are). It is surprising how many people don't have this lightbulb moment - even where we expect to find (very) skilled and experienced professionals. That's the gap we need to address.</div><div><br /></div><div>In intentionally very sketchy summary, it requires a crafted (but trivially so) packet that exploits an assumption being made about the type of traffic they're trying to allow through the router - for a tool hardly anyone uses, but that arguably has "nice internet citizen" written on it to allow to work properly through your internet routers - a laudable goal in and of itself. </div><div><br /></div><div>The discoverer has done the responsible disclosure thing, contacted the publisher, equipment vendor, and the networks they found out were vulnerable. You may or may not notice <a href="https://www.oreilly.com/catalog/errata.csp?isbn=0636920042709">errata </a>or advisories stemming from this at some later date; I hope so. Informed networks of course quickly remedied the issue.</div><div><br /></div><div>I think this underlines the need for (even?) better security awareness and training. 
</div><div><br /></div><div>I think we all regularly read and look for known security issues in the products and services we use - and certainly read the heck out of release notes looking for trouble - but how often do we look for the errata of technical publications like books?</div><div><br /></div><div>There are possibly thousands of code/config snippets that work, solve a problem, and as a side-effect, open a hole large enough to drive a supertanker through. Some systems even ship in states that are arguably like this. It stands to reason then you're most at risk when you know not what you do (standing proudly atop the <a href="https://schoolsysadmin.blogspot.com/2020/07/dunning-kruger-and-learning.html">summit of Mount Stupid of Dunning-Kruger</a>), when you're in a rush - and when you don't stop to evaluate things carefully enough, no matter the reason. </div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEji0u9psUdfGxzrkEAahyphenhyphenPnAzFTUcuk_4dIVHZFCT9NtfrYXHzckbPHszMK8XRuL1uZolNGKE0MOaW1n5ozuKxcHUsxkt5GjOMm6A-t4jti-lzBfOPRaTK7n6wqLJiAEmXjzgaUZLDM1r0/s500/learnallthethings.jpg" style="margin-left: auto; margin-right: auto;"><img alt="Learn all the Things" border="0" data-original-height="355" data-original-width="500" height="227" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEji0u9psUdfGxzrkEAahyphenhyphenPnAzFTUcuk_4dIVHZFCT9NtfrYXHzckbPHszMK8XRuL1uZolNGKE0MOaW1n5ozuKxcHUsxkt5GjOMm6A-t4jti-lzBfOPRaTK7n6wqLJiAEmXjzgaUZLDM1r0/w320-h227/learnallthethings.jpg" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><span style="text-align: left;"><b>Learn all the things!</b><br /><i>Derivative meme of Allie Brosh's <a href="http://hyperboleandahalf.blogspot.com/2010/06/this-is-why-ill-never-be-adult.html">Clean all the 
Things</a>!</i></span></td></tr></tbody></table><div><br /><br /></div><h1 style="text-align: left;">What's good security training, anyway?</h1><div>First up, I don't think everyone involved in IT <i>needs</i> to become a certified pen-tester, ethical hacker, or anything like that (although if that interests you, absolutely go for it, and more knowledge here in more people is going to help the entire industry). As with all things, the more you know, the more you can contribute, and the more you do it, the more natural and reflexive it becomes. The more central security is to your current (or desired) profession, the more attention you should pay to it. Security MUST be a consideration for all IT people, no matter your level or specialisation. </div><div><br /></div><div>What you DO need is three things:</div><div><ol style="text-align: left;"><li>Firstly, get a decent and thorough grounding in information security. I'd argue CompTIA's Security+ is enough to start with, reasonably cost-effective to find learning materials for and certify in, and anyone with more than 3 years of IT experience should do it if they haven't already. </li><li>Secondly, you need to build from that - keep an eye on even the popular online press, follow some infosec people online, have conversations about this stuff, and you'll see what the big exploits and threats are; add those to your mental database of nefarious ways of pwning expectations; make sure to pause and reflect on how and why they worked, and what the mitigations are or could have been applied (not only "fix the broken code"). Concentrate first on your most "important" job area(s), then start to learn about those areas that interface with that - and then another degree of separation out at least. </li><li>Thirdly, develop an appropriately devious mind! Think how <i>you</i> could bypass the assumptions you've made to secure your network/app/business process/etc. - and plug those holes. 
Rinse, repeat, <i>ad nauseam</i>!</li></ol></div><div>You must then marry these three key areas, extracting the devious thoughts of the second and applying them to your knowledge from the first, synthesized into the devious mind of the third point! Look particularly for results of the "<a href="https://en.wikipedia.org/wiki/Unintended_consequences">law of unintended consequences</a>" - what is the allow rule you're configuring<b> actually </b>allowing - is that <i>really</i> what you expected; is it actually <b>more permissive</b> than at face value? Is there a trivial/effective/realistic (or is it very damaging if they do find a bypass) way for someone to bypass or exploit that assumption? Here, you're only really going to catch this kind of problem once you've got at least a basic understanding of what the underlying protocols are doing, what the filter rules are examining (and what they are <i>not</i>) - and, most importantly, what your assumptions are for traffic - including how it will handle traffic that is either intentionally or unintentionally "odd". If you thought learning the multitudinous parts of a tcp/ip packet or an Ethernet frame in detail was kind of pointless rote learning, here is exactly where that level of knowledge starts to pay off. </div><div><br /></div><div>Obviously, it's worth adding a bit of risk assessment and management, with the traditional likelihood-times-impact formula and assess those results and guide your efforts, but good risk assessments may be quite hard to give objectively if you don't understand <i>enough</i> about the technology being assessed or the techniques that may be used against you. 
Remember, a business can always over-ride a risk with (more or less effective) mitigating controls and management decisions that supersede best practice or your advice - but make certain this is formally adopted at the right levels, and that your informed considered opinion is noted, as relevant (particularly if you say "don't do this, because of these reasons" - sadly, professional CYA is necessary at times). Audit risk logs are handy records here (super-privileged information, because they'll usually highlight ALL the known exploitable holes in your infrastructure). No organisation is perfect, but you need to be able to live with delivering the best possible information on which to assess and manage risk, and provide mitigations where that specific risk can't be otherwise eliminated. </div><div><br /></div><div>Traditionally, IT ran on a "secure the boundaries" model - simplistically, the people looking after the firewalls were trusted to get it right. That's not going to work in a cloud-centric, "borderless" world - security has to be baked in at every level of your organisation, from the first tier helpdesk tech that resets passwords up to the highest levels of your management structure; as far as possible, all users of your services should also have sufficient awareness to not run straight into the security equivalent of a burning building, shouting "<i>YOLOOOOOOooooooooo.......!</i>". </div><div><br /></div><div>Whilst there was an early paradigm of "Be conservative in what you do, be liberal in what you accept from others" (often reworded as "Be conservative in what you send, be liberal in what you accept") (<a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's law / Robustness Principle</a>) this may end up causing you considerable grief, particularly if this hasn't been rigorously implemented by those elsewhere in your infrastructure "stack"! 
Also, you can "accept" something and throw it in the bin if it fails subsequent tests, but the more (necessary) levels of careful examination everything goes through, the less likely a lapse in one of them leads to problems. Conversely, unnecessary layers things pass through are an expanded attack surface and may represent a net decrease in your security. </div><div>In constructing internet services, permissive allowance tends to lead to a world of hurt these days. We've certainly widely moved to "default to deny" stances in things like firewalls - long ago, by necessity. "Assume compromise" is also an increasingly common mindset - how do you detect and then resolve that?</div><div><br /></div><div>Think about all the attacks that have resulted from crafted packets - things like the <a href="https://en.wikipedia.org/wiki/Ping_of_death">Ping of Death</a>, <a href="https://en.wikipedia.org/wiki/LAND">LAND</a> and <a href="https://en.wikipedia.org/wiki/Smurf_attack">Smurf attacks</a>, against which most systems are now long since hardened. There will doubtless be subsequent incredibly obvious-in-hindsight attacks demonstrated in future. You need to use these examples to start to examine your assumptions about how you're handling traffic/data - no matter what part you play in the professional IT field, or even - to some degree - as an "end user". </div><div><br /></div><div>I vividly recall the first "real" hacker I met - at school in the mid 1990s. The guy was about 5 years younger than I was, writing his own operating systems and OS-like frontends to DOS, and had the kind of devious mind that asked "what if" questions about EVERYTHING. He showed us the hilarious bypass of the school library's brand new "anti-theft" system, based on RFID-like tags. When you legitimately checked out a book, there was an insert you put into the pocket with the anti-theft tag in it. 
It turns out that the anti-theft tags in the books <i>cancelled each other out</i>, so long as you aligned the tags right next to each other (just being in general proximity wasn't enough); as long as you "borrowed" multiples of two, well, checkouts were for dummies. Simples! Of course, we responsibly disclosed this to the horrified librarian, who then swore us to secrecy. That was a large number of years ago now, and the library has since been moved and refurbished, so, well, hopefully they've sorted that out. I have long used examples like that, my own attempts to get around things (as a thought experiment), and the exploits I've heard about to inform ways of raising the bar for anyone trying to exploit my systems! Working in schools finds you a lot of inquisitive teenage minds with time on their hands, many of whom are only too happy to show where your assumptions fail (Universities, even more so)... Obviously, if you work in some industries, the stakes are much, much higher. </div><div><br /></div><h1 style="text-align: left;">Do we examine the assumptions of what we are allowing carefully enough? </h1><div>I'd argue that we often don't, from two perspectives - </div><div><ol style="text-align: left;"><li>We often think other people know more than we do, and are therefore probably right; </li><li>We look for reasonably quick fixes to problems so we can move on to the rest of our never-ending to-do list - sometimes basic functionality is "good enough" - but is that basic functionality achieved dangerously? </li></ol></div><div>The key insight you need to pick up from pen-testing and hacking is that people can modify packets or other data in unexpected ways, and do not necessarily follow your approved or expected way of doing things. </div><div><br /></div><div>Make sure you're not basing the security of key infrastructure on the assumption that people won't specially craft (or won't intentionally mess with) packets or do things in odd ways. 
</div><div><ul style="text-align: left;"><li>What assumptions have you made? </li><li>Does that assumption fail safely - i.e. if someone messes with a packet in a particular way, does your filter still work, does it do what you expect it to, or does it result in unintentional remote exploit? </li><li>What are you REALLY saying with each and every security rule you put in place and the order they are encountered, and the order in which they are evaluated? Why is your infrastructure constructed that way? Is there a better way? </li><li>Are you regularly reviewing security controls and infrastructural decisions critically (not only seeing that they are still needed or signed off on, but that they haven't got an unintended effect)? </li></ul></div><div>Certainly, you need to move towards models like zero trust - but remember, each time you open something, you're granting at least some trust to something else - you need to watch out for when that something isn't quite what you expect! </div><div><br /></div><div>Bottom line: challenge everything, particularly anything to do with security. Never trust anything anyone else has written, unless you're either willing to accept the risk, or "audit" every single line for it to actually meet your intended purpose - and_no_more (we had a BGP policy called and_no_more, which denied all, explicitly called after all the previous export policies to allow what we definitely wanted in the ways we wanted it - and avoid any inadvertent leaks). I've been in several organisations where even if there is an implicit deny all at the end of a firewall on that platform, there is an explicit one there, too. 
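<div><br /></div><div>As a minimal sketch of what "what is this rule REALLY allowing?" can look like in practice, the toy first-match filter below shows how an allow rule written with one purpose in mind can match far more than intended, and how an explicit final deny (in the spirit of and_no_more) makes the fall-through visible. Everything here is hypothetical and for illustration only: the <span style="font-family: courier;">Packet</span> and <span style="font-family: courier;">Rule</span> classes, the rule names and the documentation-range addresses are assumptions, not any vendor's filter syntax or the configuration discussed in the talk.</div>

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Packet:
    """Hypothetical, heavily simplified view of a packet."""
    src: str
    dst: str
    protocol: str
    dst_port: Optional[int] = None


@dataclass(frozen=True)
class Rule:
    """Hypothetical filter term; a field left as None means 'match anything'."""
    name: str
    action: str                        # "accept" or "discard"
    protocol: Optional[str] = None
    dst_net: Optional[str] = None
    dst_ports: Optional[range] = None

    def matches(self, pkt: Packet) -> bool:
        if self.protocol is not None and pkt.protocol != self.protocol:
            return False
        if self.dst_net is not None and ip_address(pkt.dst) not in ip_network(self.dst_net):
            return False
        if self.dst_ports is not None and pkt.dst_port not in self.dst_ports:
            return False
        return True


def evaluate(rules, pkt):
    """First match wins; anything unmatched falls through to an implicit deny."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action, rule.name
    return "discard", "implicit-deny"


# Intended meaning: "let clients reach our DNS resolvers".
# Actual meaning: "accept UDP/53 to ANY destination", including management addresses.
loose = [
    Rule("allow-dns", "accept", protocol="udp", dst_ports=range(53, 54)),
    Rule("and_no_more", "discard"),
]

# Same intent, but the assumption (destination is a resolver) is now stated in the rule.
tight = [
    Rule("allow-dns", "accept", protocol="udp",
         dst_net="198.51.100.0/24", dst_ports=range(53, 54)),
    Rule("and_no_more", "discard"),
]

# An "odd" packet: the right protocol and port, aimed at something that is not a resolver.
odd = Packet(src="203.0.113.9", dst="192.0.2.1", protocol="udp", dst_port=53)

print(evaluate(loose, odd))
print(evaluate(tight, odd))
```

<div>Running the same "odd" packet through both rule sets shows the difference: the loose rule accepts it, while the tightened rule lets it fall through to the explicit and_no_more discard. Replaying deliberately awkward packets against a proposed rule set like this, before it goes anywhere near production, is one cheap way to test your assumptions.</div>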
</div><div><br /></div><div>A likely common oversight: We are probably <a href="https://insights.sei.cmu.edu/sei_blog/2018/04/best-practices-and-considerations-in-egress-filtering.html">far too permissive with outgoing filters</a>, in many cases, and it is often only when we're concerned with <a href="https://en.wikipedia.org/wiki/Data_loss_prevention_software">DLP</a> that we really start paying attention to the garbage that goes out of our networks. A basic standard should be things like blocking outgoing tcp/25 from client network addresses that are not sanctioned MTAs, blocking contact with known C&C networks, and making sure spoofed packets aren't leaving our network (e.g. uRPF or other controls that amount to that). Stricter than that often makes sense - if there are protocols that should NOT be hitting the internet (SMB, anyone?) drop them! This is obviously much easier at network edges than through global tier one networks, and is best controlled at or near customer edge nodes (if not already done between trust domains internally).</div><div><br /></div><div>This doesn't mean you have to throw everything other people write in the bin - it calls for you (and your colleagues, to cast more eyes - and brains - over it) to carefully assess each change you're thinking of making, and to recognise that everyone is human and errs from time to time; this particularly stands for any configurations you're borrowing from elsewhere. </div><div>You're probably not going to (be able to) audit the code of the operating systems you rely on, nor many of the software programs, and few people even have sufficient expertise to do that - but you should question your assumptions, especially about how and why you are configuring security devices and policies in a particular way, and take steps to secure the "human element". </div><div>Definitely spend some time thinking about this in code or scripts you write. 
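<div><br /></div><div>To make the basic egress standard described above a little more concrete, here is a rough sketch of the decision logic: drop spoofed sources, drop protocols that should never reach the internet, and only let sanctioned MTAs send tcp/25. The prefixes, the mail server address and the function name are all made up for illustration; a real network would express this in router or firewall policy (and uRPF), not in a script.</div>

```python
from ipaddress import ip_address, ip_network

# All values below are hypothetical examples, not recommendations.
OUR_PREFIXES = [ip_network("10.20.0.0/16")]       # address space we legitimately originate
SANCTIONED_MTAS = {ip_address("10.20.0.25")}      # the only hosts allowed to send SMTP out
NEVER_TO_INTERNET = {("tcp", 139), ("tcp", 445)}  # NetBIOS session / SMB


def egress_decision(src: str, protocol: str, dst_port: int) -> str:
    """Return what a basic egress policy would do with an outgoing packet."""
    source = ip_address(src)

    # Anti-spoofing: a source address that is not ours should never leave our network.
    if not any(source in prefix for prefix in OUR_PREFIXES):
        return "drop: spoofed source"

    # Protocols that have no business crossing the internet edge.
    if (protocol, dst_port) in NEVER_TO_INTERNET:
        return "drop: protocol not allowed out"

    # Outgoing SMTP is only acceptable from sanctioned mail servers.
    if (protocol, dst_port) == ("tcp", 25) and source not in SANCTIONED_MTAS:
        return "drop: smtp from non-MTA"

    return "forward"


print(egress_decision("10.20.0.25", "tcp", 25))
print(egress_decision("10.20.7.13", "tcp", 25))
print(egress_decision("192.0.2.66", "udp", 53))
```

<div>The order of the checks is itself a design choice worth questioning: the spoofing check comes first so that nothing with a forged source can be "rescued" by a later, more specific allowance.</div>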
</div><div>If you lurk in service provider communities, you'll find people complain about how hard it is to secure router control planes, and that even the best available filters people share have gaps or gotchas. Whilst running a more or less airgapped management network is fairly easy within a campus or individual datacentre, it's much harder to do so across an internet-scale network (partly because of cost, partly because of complexity - achieving parallel robustness of customer circuits AND control circuits is expensive, and compromising on management resilience is shooting yourself in the foot where you can't simply pop downstairs and poke things). Scale and complexity are potent underminers of security - both because the more moving parts there are, the harder it is to keep everything as secure as possible (assumptions about who is doing what between or within teams are a common human failing, on top of the added difficulty and larger potential attack surface) and, of course, because larger things are more interesting targets. It's a big field, but you can eat an elephant - one piece at a time. </div><div><br /></div><h1 style="text-align: left;">Where to from here? </h1><div>Quite who needs to police these things for the good of the Internet/IT ecosystem is an interesting question, but we can safely assume that some time in the future, if we don't get our own house in order, policy and regulation will take the place of elective professional standards and norms. Let's keep this in the RFC and BCP contexts, not in the realm of law, regulation and policy, for as long as we can! 
</div><div><br /></div><div>Considering how critical the Internet is becoming (more utility-like every day), this also perhaps starts to raise the question of when IT itself will become a more regulated profession like accounting, <a href="https://www.theatlantic.com/technology/archive/2015/11/programmers-should-not-call-themselves-engineers/414271/">actual engineering</a> or medicine, with expected and more or less enforced standards of knowledge and ongoing professional development (within the limits of human error). You certainly have it within your own power to ensure your actions and (usually) those of your colleagues are carefully assessed for their assumptions, and tested in appropriate ways. Diversity of thought here can be very valuable - but obstructionist "no, because security" on everything can go too far and ultimately undermine your efforts. The rise in people calling for "<a href="https://en.wikipedia.org/wiki/Antifragility">anti-fragility</a>" and the expurgation of "<a href="https://en.wikipedia.org/wiki/Anti-pattern">anti-patterns</a>" strongly suggests that we're moving from a world where IT was in many senses a pioneering toy, to one where it's become essential infrastructure - like highway bridges, hospitals, water, landlines, gas, aviation, sewerage and electricity, which if they were managed in a "move fast and break things / live life in beta" manner would almost certainly have the world looking rather different than it does. The key difference? Arguably, licensed professionals and rather different attitudes and tolerances to failure and breakage. </div><div><br /></div><div>IT in many ways remains a somewhat "apprenticeship" based industry - you might do some reading here and there, and you may do courses, degrees, certs and so on, but at the end of the day, much of what you do is emulation of those who have gone before you, picked up working side by side with them (or inheriting their code or infrastructural decisions over generations!). 
This is why we place a big premium on "experience" - the education arguably isn't rigorous, in depth or practical enough to replace years of "on the job" training and seeing first hand what works and what does not - and adjusting to individual little idiosyncrasies unique to every organisation and "their" way of doing IT. The seniority in our teams comes as much from having seen first hand why things are the way they are (and why they aren't another way) as from being able to predict (with varying degrees of confidence) how further change might perturb a "stable" system - along with perhaps decades of thought, learning and experience that lead to "gut feels" that work surprisingly well. </div><div><br /></div><div>"Or equivalent experience" is a key phrase here. You are not going to be allowed to design and build huge bridges because you have 15 years of experience laying rebar and concrete for them - you will need a civil, mechanical or structural engineering qualification (depending on what you're designing!), and progressive experience of designing and project managing construction of bridges that don't fall down - and membership of a professional body that enforces standards. There are vaguely similar computer-related bodies in some countries, but I don't think they are yet anywhere near as rigorous as those for more "traditional" professional qualifications. We should expect they will become more like those traditional professional bodies, and we should have input into making sure that they are highly regarded and rigorous marks of professional competence. As IT becomes more and more central to "everything" we do, the stakes are ever higher, and our professional responsibilities <i>must</i> grow to meet those expectations. 
A key danger we need to guard against, I think, is people who get too far firmly astride the <a href="https://schoolsysadmin.blogspot.com/2020/07/dunning-kruger-and-learning.html">Dunning-Kruger summit of mount stupid</a> - people need to move off that lofty peak and into areas where they learn and grow, and get the holistic experience they need to excel. I've certainly been there...</div><div><br /></div><div>We must also recognise that moving to professionalise IT in this way also raises considerable barriers to entry from under-represented groups and for people without significant financial wealth. Much of IT may end up being shut off to those who cannot afford a multi-year higher degree, and more or less "intern" early professional practice requirements (with all the challenges that brings where it is un(der)paid). As we consider fighting for professionalising IT, we should also fight for mechanisms that ensure ongoing social justice - at the very least within our profession. At the moment, grit, determination and a little luck can get you on board a promising career train; it would be a shame to completely lose that path to self-improvement and professional growth. There are few other careers that offer such stratospheric potential from modest beginnings.</div><div><br /></div>
<blockquote class="twitter-tweet"><p dir="ltr" lang="en">Every job in “IT” is an “IT Security” job. To be aware of the context of your decisions on attacker abilities is universal.<br />At the highest maturity there should hardly be an IT Security department. It’s everybody’s job to provide operational assurance. There’s no delineation.</p>— Swift⬡nSecurity (@SwiftOnSecurity) <a href="https://twitter.com/SwiftOnSecurity/status/1281740099697496065?ref_src=twsrc%5Etfw">July 11, 2020</a></blockquote><div><br /></div><h1 style="text-align: left;">Thinking & Learning works everywhere...</h1><div>If you needed another reason to develop such knowledge and patterns of thinking, that same knowledge and method of application makes you a MUCH better troubleshooter and implementer - and the more of the infrastructure you understand, the further those insights stretch, and the better you can deal with weird interactions between all the parts. Eventually, you will - kicking and screaming, perhaps - realise humans are one of those parts and have to learn about them, too!</div><div><br /></div><div>Learning, thinking and experience are synergistic; the more you have or do of each of them, the better you are able to find solutions (and problems!); this whole is more than the sum of its parts. This is particularly the case where you spend some time engaged in <a href="https://www.enhancementthemes.ac.uk/docs/ethemes/student-transitions/critical-self-reflection.pdf">critical self-reflection</a> to reinforce learning and discover gaps and things you're not good at. 
You may also need to develop a good ability to see when you need to hand over to someone else - and discover where you find and cultivate those people, but that's another topic; in brief, teams, communities of practice and diversity all enrich what we can collectively achieve.</div><div><br /></div><div>If you're consistently the "smartest person in a room", you need to go and find some other rooms to hang out in, because you'll learn a heck of a lot more that way. If you feel like the dumbest person in the room, it can actually be very motivating to get a significant helping of Clue, ASAP. There are forums that seek to do precisely that - find them; they tend to exist in hacker communities and professional networks. Find the right ones, and you'll pick up on all the latest norms, some cutting edge practices, and a rich history of skeletons that explain why the world is the way it is. Respect those gatherings of minds, and make sure you observe their written and unwritten rules about respect and confidentiality. A lot of what is said is within a certain professional "circle of trust" that is understood will not really go beyond the borders of those walls (virtual or otherwise) until it is appropriate (if it ever is). </div><div><br /></div><div>Obviously, it is impossible to literally see <i>all </i>the talks and read <b>all</b> the things, but you need to leverage the key concepts and principles that are revealed to you as much as you can - and then go out and get some more. Just like you need to eat and drink every day, you need to learn and think a bit every day too in order that your brain is sustained. If you can find a topic that is so interesting to you that learning it does not seem like work, but is something you would rather do instead of some other leisure activity you'd normally engage in, then you have been blessed with an excellent hobby. If you're exceptionally lucky, those interests will align with topics that are useful to your career, too. 
Sure, you can do some "learning as a chore", but "learning as fun" is much more sustainable, and, I'd argue, the main type of learning you should do outside of work hours. We've all been grabbed by excitement about a project, looked at the clock, and realised it is somehow 3am (the last time that happened to me, I was trying to learn how to do something I was doing in Bash in Python instead). Watch out for burn-out if all of your hobbies and leisure hours are indistinguishable from aspects of your day job; outside of a tiny minority of people, this does not end well!</div><div><br /></div><div>I'll leave you with a further industry perspective on this: </div>
<div><blockquote class="twitter-tweet" data-conversation="none"><p dir="ltr" lang="en">Earlier today I was on a small conference call with a peer. And they said this other team didn’t have the knowledge or contextual awareness to make a decision.<br />So I asked, “Why are you and I qualified but they’re not?”<br />The other tech thought for a bit.<br />“We’re always researching.”</p>— Swift⬡nSecurity (@SwiftOnSecurity) <a href="https://twitter.com/SwiftOnSecurity/status/1281736485906059271?ref_src=twsrc%5Etfw">July 10, 2020</a></blockquote></div><h1 style="text-align: left;">Interview / job application preparation</h1><div><i>James Stapley. Posted 05/08/2020; last updated 28/08/2020.</i></div><div>I’ve sat on both sides of the interview table several times. I certainly don’t think I’ve mastered either end of that game, but certainly, there are some common key things you need to think about before you submit a CV and again before you hopefully head into a job interview...</div><div><br /></div><span><a name='more'></a></span><div><br /></div><h1 style="text-align: left;">Reflect on the position</h1><div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2mgflcamJRokoiEJ3xaUPDMEBEPrOYvuqXZLZwOOX7s3ik87TFQ-ZfxwVhhtuAFVpSZ0Jn5B6GL5cgpt3HgDEwVXUgMfJLEi1EGZhude31EvSYTGwWum6r8dAnxZREGt28p18qSNYrD8/s1050/photo-1487528278747-ba99ed528ebc.jpg" style="margin-left: auto; margin-right: auto;"><img alt="&quot;For Hire&quot; sign" border="0" data-original-height="700" data-original-width="1050" height="267" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2mgflcamJRokoiEJ3xaUPDMEBEPrOYvuqXZLZwOOX7s3ik87TFQ-ZfxwVhhtuAFVpSZ0Jn5B6GL5cgpt3HgDEwVXUgMfJLEi1EGZhude31EvSYTGwWum6r8dAnxZREGt28p18qSNYrD8/w400-h267/photo-1487528278747-ba99ed528ebc.jpg" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><a href="https://unsplash.com/photos/fY8Jr4iuPQM">Picture by @clemono2 on Unsplash</a></td></tr></tbody></table><br />Obviously, every position is at least a little unique. Spend some time thinking about what the “killer features” in a candidate for that position might be and figure out how you can best display them in yourself. Be very sure you’ve exhaustively pored over any job or person description they’ve given, and that you’ve reflected (at the very least) on the “essential” attributes, and wherever possible, on the “desirable” ones. Some people state that job descriptions are a wish-list and not a formula that must be met, but the degree to which that is true depends on a number of things, including the size of the applicant pool, the closely related desirability of the position and whatever “gatekeepers” you might need to pass (automated CV keyword searches, people that don’t understand transferrable skills within your specialised discipline, and so on). The more relevant you look on paper in the early application process the better, but make sure you don’t oversell yourself, and certainly, never lie. </div><div><br /></div><div>The documents they make available are quite literally your “cheat sheet” on what they’re expecting the job to be and how to identify yourself as the best candidate – but realise that some job descriptions can be quite generic (and all too often rather vague), or even a template HR downloaded from somewhere, and the job may end up with a somewhat different set of actual priorities (but it’s unlikely they’ll be completely outside of what that documentation says, at least in larger organisations). 
If you see a really weird set of criteria, one of two things may be happening – either they’re a genuinely unique place with an unusual set of requirements, or they’re trying to hire a specific, already identified candidate whilst meeting the letter of their hiring policies. Avoiding that latter behaviour is one reason job descriptions/specifications are quite vague and generic, and often don’t quite match what you do in the role – and fear of this happening is why HR tend to take over the role of attaching job descriptions/specifications to roles, rather than departments or line managers. </div><div><br /></div><div>In some instances (from what I’ve seen this is quite rare) you might be able to get hold of an actual human at the organisation that has some interest in filling the position. If you see “for more information, contact…” do so! It can be worth taking this route and having a conversation about the position and, particularly, some of the key challenges or opportunities in that role. Definitely pin down the key priorities within the job description. It may also help you (early on) find any “red flags” for you about that organisation, and also means at least one person may recall having talked to you before, and that you seemed like a decent prospect – so treat any such conversations as semi-formal interviews, because they may have influence during the hiring process. As much as people are supposed to ignore things they know aside from the CV in front of them and what the candidate said in interviews in the interest of a “fair” and “transparent” recruitment process, human nature tends to ignore that and use all available information – and, all too often, “gut feel”. 
Don't lose out to other candidates that have that advantage simply because they picked up the phone a few weeks ago!</div><div><br /></div><div>Turn the table around completely – imagine you’re assessing CVs or interviewing candidates for that position – what do you expect to see, and which are the most important factors to you? (I realise you don’t have a panopticon to see what that specific organisation needs right now, but with some experience, you’ll have a general idea of what most organisations actually need). Make sure you're presenting those qualities!</div><div><br /></div><div>How can you best meet those requirements and – vitally – conclusively and very concisely demonstrate that you do? </div><div><br /></div><h1 style="text-align: left;">What makes a good CV, anyway?</h1><div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFsGaXDi4okfmESuPRR1IjSDVzYaT6Q_Efg2PfLWerf-a0GwTNhMVqJw3F1vwqExq0mwn3z6wNVEqWsHSn4yHsN7DGW7BfcwsAKq6jS4as8_LJES9RqkH8e6QMxmKaGP4J2hz_K64bMVg/s1050/photo-1586281380349-632531db7ed4.jpg" style="margin-left: auto; margin-right: auto;"><img alt="Clipboard with &quot;My Resume&quot; on it next to laptop" border="0" data-original-height="700" data-original-width="1050" height="267" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFsGaXDi4okfmESuPRR1IjSDVzYaT6Q_Efg2PfLWerf-a0GwTNhMVqJw3F1vwqExq0mwn3z6wNVEqWsHSn4yHsN7DGW7BfcwsAKq6jS4as8_LJES9RqkH8e6QMxmKaGP4J2hz_K64bMVg/w400-h267/photo-1586281380349-632531db7ed4.jpg" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><a href="https://unsplash.com/photos/7iSEHWsxPLw">Picture by @markuswinkler on Unsplash</a></td></tr></tbody></table><br />This is a moving target and standards and norms (and fashions) change. 
It’s worth seeking out topical advice on the sector you’re applying to in order to make sure what you present is similar to what they expect, and making sure that advice is still relevant – and relevant not only to your targeted industry sector, but also to the country in which you are planning to work (yes, this changes across sectors and countries!). Most recruitment experts suggest that not only should you tailor your CV (and particularly cover letter) to a position, but that you don’t simply want to list a bunch of skills; you want to list <i>achievements</i>. </div><div><br /></div><div>These achievements should be, if you’ll forgive me for slightly overloading the initialism often used for project targets, SMART: <b>S</b>pecific, <b>M</b>easurable, <b>A</b>ccolade, <b>R</b>elevant, <b>T</b>opical. Achievements in this context are Specific examples which you can (ideally) put a number (Measurement) to; focus on things that have brought you Accolades (or an internal sense of great Achievement) - i.e. their value or importance was recognised by others beyond "meets expectations"; they should be Relevant to the job you hope to have; they should be Topical to the industry and current trends in order to illustrate your work behaviours, continuing education and development, and show your insights into key business realities. You may get some good ideas as you do the STAR exercise discussed later. </div><div><br /></div><div>"SMART" achievements might be:</div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><div style="text-align: left;"><i>"Reduced costs of wireless LAN by 50% by selecting a different vendor." </i></div></blockquote><div>It's Specific - what you did and how; </div><div>Measurable - there is a value there;</div><div>Accolade - your manager was pleased about it;</div><div>Wi-Fi forms part of your job, so it's Relevant;</div><div>and, well, Wi-Fi is just Topical these days! 
</div><div><br /></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><div style="text-align: left;"><i>"Introduced an Ansible-based system of playbooks to move the network access layer to infrastructure-as-code, resulting in vendor neutral new edge switch installs, corrupt configuration restoration and upgrades taking less than 5 minutes, with version controlled configuration."</i></div></blockquote><div>Again, it's a Specific example, with some Measurement of the impact (5 mins); you bet you got some Accolades around the office for that; it's super-Relevant and extremely Topical. </div><div><br /></div><div>Hopefully you see the pattern? </div><div><br /></div><div>Change a boring job profile from a list of tasks, KPIs, skills (and a possibly rather irrelevant list of every software program you’ve used since you were twelve) into something that demonstrates what you did, and how, that is worth specifically noting, emphasizing the items that have at least a modest “wow” factor to them. </div><div><br /></div><div>Don’t just state you’re good at something, <i>back it up with some form of evidence</i>. If you say you’re “good at mentoring”, what is evidence of that? Did your team end up all getting promotions from the outcomes of your mentorship, or did 360 degree reviews bring this up as a big plus for you? If you did a major project, what happened to the business as a result? Did you save money in some way? Better meet or smash SLA targets as a result? How much? You’re a good communicator, eh? Prove it! </div><div><br /></div><div>Another question to keep asking yourself is a somewhat cynical “so what!?” – what about this statement makes you great? 
If I’m hiring a systems administrator, I’d expect them all to be able to do the things listed in the job profile, and I don’t necessarily want a list confirming that hey, you’ve had a similar job profile before (your career history, where you list your positions kind of tells the reviewer that in way fewer characters). Always qualify a statement in some way with some form of evidence. In a CV, you don’t have a lot of space, and you need to really focus in on the best ones. It may be better to have some longer lists of achievements, and select and include the ones you feel best illustrate your suitability to any particular position, staying within acceptable page limits. </div><div><br /></div><div>If there are some major “wow” achievements, consider elevating them to a “Key Achievements” section just above the chronological listing section – but don’t repeat them under the position where you achieved them, as you are then wasting valuable space. Some people are a little obsessive over page count limits for CVs; as most of them are parsed electronically first, you probably don't need to worry about page limits as much, but don't go overboard (and respect a limit if it is explicitly mentioned) - and more words means more keywords to stoke the interest of those algorithms!</div><div><br /></div><div>The more your career achievements follow this sort of pattern, the more impressive they apparently are. In some cases, it can be quite hard to put numbers to things you’ve done, but anywhere you can put some numbers about how you made things better or saved/earned money, do so – but be certain you can justify and defend any figures you give – do NOT make them up, and do not overinflate them. 
I’ve been given advice to push them as far as you feel comfortable defending (and, presumably, not so far that a referee would laugh or be very surprised when questioned about it - indeed, you should always send your referees your current CV and perhaps even a sample reference letter or suggested talking points). You may be able to put figures in from sources like ticketing (time to resolve without re-opening) or monitoring systems (uptime – but in the Age of Patches, years of uptime is a bad thing, whereas on-time upgrades during maintenance windows are good. It’s unplanned downtime that is bad). It is worth regularly filing away achievements, like a time a customer, peer, co-worker or boss gives you a compliment on something you do, and keeping some notes somewhere of major things you’ve done (indeed, a long-form CV may be a good place to keep such things, but make sure you cut it down to the best examples before you send out a CV). </div><div><br /></div><div>I find this a particular challenge, perhaps because I don’t think in those terms, or (my wife suggests) my internal standards are such that I don’t view things I’ve done as particularly noteworthy or remarkable. Also, I can’t actually quantify the results of many of the changes I’ve made, because all too often the change I introduced was to have any numbers whatsoever, so there’s nothing to compare back to!</div><div><br /></div><h1 style="text-align: left;">You’ve got a few seconds</h1><div>Even if a company isn’t using one of those systems that attempts to use keyword mining to cut down on the applicant pool to select the most “promising” candidates, you often only have a few seconds to make an impression. I spend notably longer reviewing applications than I’ve seen other colleagues spend – they will often spend less than 30 seconds looking at an application before deciding which pile the applicant belongs in (at most, there are 3 piles: “nope”, “shmaybe” and “interview”). 
So you want to make it really easy for people to find the key things they’re looking for in a candidate. If your application form, CV and cover letter don’t make it into that 3rd pile, you have no hope of landing an interview. Spend the time getting those "good CV" things right! </div><div><br /></div><h1 style="text-align: left;">“Stan” that organisation</h1><div>If you’ve not come across “stanning”, it’s a portmanteau of “stalker” and “fan”. Leverage your hopefully considerable information literacy to find out as much as you can about the organisation you’re applying to work at. This can pay off two ways; firstly, you can often throw some bait out in your cover letter, and secondly, it is even more useful when you get to the interview stage. </div><div>This is where you can go beyond the job description and person profile and get a broader sense of that organisation and/or department, where it’s been, where it’s going and how you’re most effectively going to contribute to those successes. In many cases, you can glean a lot of information from websites (particularly of public sector organisations) – but careful and judicious googling will often turn up gems such as where their staff have presented the upcoming network expansion plans at a technical conference. Reflect on any of those! If the information is a bit obscure, it shows some really good sleuthing and a very positive amount of interest in the role and organisation you’re applying to. Look at broader patterns, trends, threats and opportunities in that sector, and how IT – and particularly you in that position – might serve to strengthen them. Be sure to bring those insights up in the interview in an appropriate question or discussion topic. </div><div><br /></div><div>Use any of those insights wherever you can. Your last chance of course is the “do you have any questions for us” part of the interview. 
If it hasn’t come up, try and find out a bit more about the role and critical immediate priorities you’ll meet. Ask them about some of the obscure details you uncovered. Discuss how your role might meet the mission and vision of the organisation, or the current challenges and hot topics of that market, industry or profession.</div><div><br /></div><h1 style="text-align: left;">Interview the interviewers</h1><div>I’ve always been hesitant to ask questions in the “so do you have any questions for us?” stage, but this can be quite useful, and so it pays to prepare some good ones – notably as some people think negatively of people who don’t have questions at that juncture! </div><div><br /></div><div>One thing that is annoying about modern recruitment is that you scarcely ever get any feedback, other than “yeah, you got the job” or “sorry, better luck next time” – or, all too frequently – radio silence! A key question you could ask that can help you going forward, notably if you’re not sure it went well or actively feel it wasn’t a success, is something like “Is there anything that concerns you about my CV / experience / answers I have given today?” – or, if you feel that is too negative, ask “Is there anything on my CV / experience / answers you would like more information about?”. Vitally, this may well give you a little insight into where you’re perhaps falling a little short (take it on the chin and seek to improve that for next time!). If you feel the interview has gone particularly well, you may like to ask something more along the lines of “What would you expect me to achieve in the first 30 / 90 days on the job?” – having them think of you in the role might work in your favour, and also gives you some more to think about in terms of preparing for getting the role – or further experiences you need to seek out, and it may spur some further conversation that may clinch the deal in your favour. 
</div><div><br /></div><div>Many people talk about interviewing being a two way process – and if you have the luxury of choice, do take that time to ensure that you take that last chance to address any insights, questions or misgivings you might have developed about the role or organisation during your “stan” research phase, and find out more about things that matter to you in a position (organisational culture, benefits, training, career prospects, etc). </div><div><br /></div><div>In many organisations, there is not just one interview, but a series of them (and sometimes various tests). The further you go down that road, the more invested they are in you as a candidate. After a few interviews, they are normally looking for reasons NOT to hire you! Likewise, those are your last chances to find out about any policy, corporate or team culture or work/life balance issues that may be bad for you before it is “too late”. Make sure you understand what your personal limits and non-negotiables are!</div><div><br /></div><h1 style="text-align: left;">Be a shining STAR</h1><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4E4ZSGk9atRa2mE7pju2Ia1Ro4gVvczybXGka_IHUH0y47QMk3WPgQUms0wum9K_MzWQvMblgMQkaV8dMTffEOvqhWP2BthBRAU5dKWUmDX74wlJqVnJnvwLFsXRb9hbT54Y2zgb0-2M/s1050/photo-1515705576963-95cad62945b6.jpg" style="margin-left: auto; margin-right: auto;"><img alt="Night sky picture of milky way above hills" border="0" data-original-height="700" data-original-width="1050" height="267" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4E4ZSGk9atRa2mE7pju2Ia1Ro4gVvczybXGka_IHUH0y47QMk3WPgQUms0wum9K_MzWQvMblgMQkaV8dMTffEOvqhWP2BthBRAU5dKWUmDX74wlJqVnJnvwLFsXRb9hbT54Y2zgb0-2M/w400-h267/photo-1515705576963-95cad62945b6.jpg" width="400" /></a></td></tr><tr><td class="tr-caption" 
style="text-align: center;"><a href="https://unsplash.com/photos/9wH624ALFQA">Picture by @denisdegioanni on Unsplash</a></td></tr></tbody></table><div><br /><br /></div><div>If you’ve spent any time at all trying to “hack” the recruitment process by learning about how to interview well or successfully apply for jobs, you’ve no doubt come across the STAR technique or something similar. </div><div><br /></div><div>In IT, we’re prone to listing skills, software or gear without any sort of measurable qualifiers of just how proficient we might be with them (other than perhaps some industry standard certifications). If you list Cisco ASR9000 on your CV, did you just look at it once or twice in a rack in the datacentre, or did you regularly execute commands on it? (I can confirm they can blow paper an impressive distance across a floor when first turned on, but as it belonged to a REN and not me, I never saw its CLI, although I’ve definitely “remote hands”-d optics into one). I would never list this on my CV, nor bring it up in an interview – but a lot of people (particularly at more junior levels) are prone to listing everything they’ve ever seen. Also, if you last used it 15 years ago or it’s obsolete, you can probably drop it off your CV (and, sure you might have been a wizard at autoexec.bat and config.sys in the 1990s, but they’re irrelevant today)! </div><div><br /></div><div>These exercises give you two things – firstly, a list of somewhat quantifiable achievements that you could list in your CV, and secondly, a script for answering a lot of common interview questions. The more prone you think you are to freeze up in an interview, the more you want to prepare, because if the answers are second nature to you, then you’ll typically freak out less and be less likely to “forget your lines” – but seriously, don’t treat this exercise as being an unchangeable script, because you don’t know ahead of time what the other actors are going to throw at you as a prompt. 
It’s more like high stakes comedy improv than reciting Shakespeare!</div><div><br /></div><div>STAR should help you form coherent answers that cover the key points they’re looking for, in a format that is easy to follow. The panel need to understand your thought processes, and they like that broken down into a few key (spoken) paragraphs that are nice and easy to follow. </div><div><br /></div><div>So what are interview panels looking for here? </div><div><br /></div><div>They’re going to be virtually marking you on a few key points. </div><div><br /></div><div>Firstly, are you actually capable of communicating with other human beings? You may even like to ask to what audience tech level you should address each of your answers (particularly if the advertisement says they want you to communicate to non-technical audiences)! If the interview is specifically billed as a "technical" interview, go deep, and do not be afraid of jargon or technicalities! Generally, you will find that each panel member asks at least one question – if you know who they are and what their function is, assume they’re asking those specific questions to assess you at their level – so if technical people ask you questions, be technical; if a generic non-technical manager asks you questions, tailor your answers to that interest/expertise level. I have a habit of immediately following my answer with “does that answer your question adequately?” – giving them a chance to interrogate me further, or ask for me to expand on something. The length of the interview appointment should also give you an idea of roughly how detailed you can expect to need to be – although you don’t necessarily know how many questions they have up their sleeves. You also want to be very wary of rambling on somewhat aimlessly, and to find a careful balance between giving enough information, and being overly wordy, or disastrously incoherent (eek). 
You can also obviously ask them how much detail they want where you think you might be about to tie up a lot of time in something they wanted a sentence answer to. In exams, this is easy – you can see the marks allocated to the question (or the amount of blank space to fill in) to guide your answer; in interviews, this is trickier, but often, the question itself gives you subtle clues as to how complex an answer they’re asking you to give.</div><div><br /></div><div>Secondly, they’re going to be assessing how you react to some “curve balls”. Most people think these are silly, but they come up a lot, so be prepared to handle them. The common advice is to simply re-frame them or, if they’re asking you for a negative character trait or incident, spin it to be a good thing. Have an answer for “what is your greatest weakness / mistake?” and so on. My go-to, because it is true, is that I find it hard to say “no” – but that leads into a quick discussion about time management and learning to set appropriate boundaries – and customer-centric (but appropriate) support. Always have a few stories about disasters, and what you learnt, and why that will not happen again. </div><div><br /></div><div>Thirdly, they’ll be looking to see how you “think under pressure” and that you come up with reasonable, considered and cogent answers to their challenges. Just like written exams, some people are good at this, and other people aren’t, but you can (and should) prepare and practice, because you never get to demonstrate how well you do in the job until you ace this stage of the game. It is unlikely you will think up every single possible question and practice it – but if you practice a fair few across a broad family of questions, you’ll be reasonably prepared. 
</div><div><br /></div><div>Fourthly, they’re trying to figure out how you think – do you display a high aptitude for solving the kinds of business problems that they encounter, or, more relevantly, expect you to encounter in that role, and how do you handle them? They want to make sure you can get the job done, and get it done without causing World War Three to break out between you and Janice in accounting – or causing a lawsuit, and get the work done without being micro-managed or leaning too much on co-workers or managers. Therefore picking (recent) business cases is better than “the time when you did <x> as a camp counsellor many years ago as a teenager” or something from a less relevant previous job or role – make it easy for the panel to visualise you as a professional working in their organisation in the job you’re applying for – and importantly in a tech-focussed role – that you’re regularly digging your organisation out of tech holes, or displaying mastery of the soft skills needed around high functioning technical departments! </div><div><br /></div><div>OK, great, so we now know why STAR is a useful thing. What is STAR again? </div><div><br /></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><div>Situation(s)</div><div>Task(s)</div><div>Action(s)</div><div>Result(s)</div></blockquote><div><br /></div><div>Firstly, the interview panel need the scene (Situation) set out for them. “Everything was broken”</div><div><br /></div><div>Secondly, describe how you assessed the problem (a skill!) – that’s the Task. “I needed to find out the <root cause>”.</div><div><br /></div><div>Thirdly, they’re going to want to know your technical Actions (possibly illustrating many skills). “I figured <root cause> out by <….>, and did <x> to sort it out, verifying this was resolved with <y>”. </div><div><br /></div><div>Finally, they’re going to want to know what the Results were. (i.e. 
did you manage to do the job, or prevent things from getting that bad again, etc). “After these actions, things weren’t broken any more, and I took the following steps to avoid <root cause> from happening again”. </div><div><br /></div><div>Although most people will advise you to have positive STAR stories, note that the result doesn’t always have to be glowingly positive – as long as you demonstrate learning from the experience and your resulting experience went on to save bacon later! You can double the R to add “Reflect” which is where you can either tie the result back to the question, or show that you’ve thought about the repercussions in some depth. Another example is that in some cases, not making a change can be a good thing! </div><div><br /></div><div>If you can, try to avoid over-stretching the scenario – if the one you think of isn’t really going to meet the question, don’t use it – you can look like an absolute prat, or worse, come off as underhanded, deceitful or dishonest. If you literally have never experienced that, say so, but immediately go on to say something like “however, how I think I would react to Situation… is Task…, which leads to Action…, which I expect would Result in…”. You may also come up with an example that isn’t necessarily from your work life – that can be fine, if the question warrants it, or is outstanding enough that it makes sense to answer with that scenario. </div><div><br /></div><div>Remember if you had access to sensitive details, you want to be cagey around those – be up front with the panel and say “OK, this is the situation, but because <reasons> I can’t share all the specifics, however, broadly… <stuff>”. Be particularly cautious about anything that could be considered covered by an NDA, unprofessional to reveal, or deals with privileged personal information about others (the usual protected classes like age, gender, sexual orientation, religion, politics, health). 
If you’ve got a great example, but it involves one of those, it is by definition NOT a great example – choose something else. In any mid-level to senior role, you are expected to be the very definition of discreet – demonstrate that when you interview by not badmouthing co-workers, companies, vendors, etc – and don’t let privileged information slip. </div><div><br /></div><div>I’ll give you a few prompts to start thinking about this (mainly from a networking role perspective), but you should develop your own, and particularly, figure out how to “spot” likely questions from a specific job profile, and develop good STAR answers to them. You can find many lists like this online (go find some) – writing down answers to them is good practice. Better still, practice<b> talking </b>your way through it, and not by mumbling it at your desk - <i>deliver </i>those lines, superstar! Remember STAR makes an excellent framework for all sorts of behavioural interview questions, not just technical ones. Use it to frame inter-personal relationship questions from the HR panel member too (Situation: difficult team member, or boss; Task: getting the job done; Actions: what you did and why you did it that way; Result: you got the job done, and learnt <X>, and achieved <Y>)! If you want to take this to the next level, and you know that the things they are going to want you to handle as soon as you start (from your “stan” search) are things you’ve got good examples of, emphasize those!</div><div><br /></div><h1 style="text-align: left;">Some examples of questions to prepare for</h1><div>Here are some fairly typical examples of questions you might get asked (prefix them with “tell us about …” or “describe”… as appropriate). 
Obviously, tune your list for the roles you're interested in!</div><div><ul style="text-align: left;"><li>A time when you used your advanced troubleshooting skills on a network problem</li><li>A time when you used your design skills on a network</li><li>A time when you had to prioritize / time management</li><li>A time when you used information literacy</li><li>A particularly strange network / IT problem</li><li>A disaster</li><li>A problem you’re proud of solving</li><li>A network design you’re proud of</li><li>A security issue you found and fixed</li><li>An innovative solution to a problem</li><li>Dealing with organisational silos</li><li>A policy issue you found and fixed (not like group policy – a written policy like an AUP)</li><li>A situation where you couldn’t figure the problem out</li><li>Providing exceptional customer service – i.e. over and above expectations</li><li>How you work in a team</li><li>Your management / supervision / coaching style</li><li>A failing project and what you did to turn it around</li><li>A difficult situation (generic, but it comes up a lot – have both inter-personal and tech answers to this one)</li><li>How you handle conflict (have team, manager, vendor and customer examples)</li><li>How do you keep up to date, as technology changes so fast</li><li>Any other noteworthy achievements? (Make sure you have something to slot into questions like this, and another favourite “What did you not have room on your CV/cover letter you wish the panel knew about?”)</li><li>Be prepared to field questions about diversity and inclusion, particularly in markets, countries or organisations where these are important values. </li></ul></div><div><br /></div><div>Those are just a few to get you started on developing your own set of precompiled answers to what, on the spot having never thought about it, can be quite horrible questions! 
Another way is to turn each key or desired feature/attribute/skill on a job profile/description into a question and answer each this way. That reminds me, I need to go work on my answers to these… </div><div><br /></div><div>Another angle is your library of “dragon slaying stories”. Every technical person has a library of experiences from the technical adventures of their lives in the trenches. Get a bunch of techs together, and sooner or later, people will be talking about these stories – most techs LOVE to hear them, and many equally love to tell them. </div><div><br /></div><div>I'll emphasise this again - the only thing you must be careful about is being insulting about people or organisations (or letting slip anything confidential) – that can leave a very negative impression of you as a person. Remember, not everyone on that panel will agree that most (l)users are best hit by a clue by four until they no longer suffer from ID-10-T problems (and some will categorically disagree, not least because of the bad PR of having people BOFH the staff or customers). There is a line you ought not to cross! Remember, IT is ultimately a<i> service </i>function…! Similarly (and to reiterate this point), don’t expose sensitive or delicate information, but don’t be strangely evasive – note that there are limits to how much detail you can go into on a situation “because x” (ethics, NDA, propriety, and so on), and that you may have to not answer if they probe too deeply. This can be an excellent way to demonstrate discretion!</div><div><br /></div><div>Sit down and figure out one of these stories for each key point in your CV, and for each key point in the job profile or person description. If you’ve had to cut your CV or application down to meet arbitrary page limits, develop them for the points you’ve trimmed, too. Consider adapting those legendary stories into STAR format. Don’t borrow other people’s stories and pass them off as your own in an interview! 
Similarly, if you’ve listed quantified or qualified “achievements” on your CV, make sure you can back them up and recite/defend them as STAR talking points. Seriously consider writing them out as responses to each point, because it is quite common for you to immediately forget every great story you have under the pressure of a number of pairs of possibly unfriendly-looking eyeballs glaring at you across a room (or videoconference), and practicing concretely like that helps burn them into your brain, and makes recalling them easier (<a href="https://schoolsysadmin.blogspot.com/2020/07/read-it-note-it-redo-it-teach-it-how-to.html">much like note-taking when learning for exams</a>!). </div><div><br /></div><div>Get good at doing “STAR” responses to questions, as they’re a useful framework. Obviously, you don’t want to be unnecessarily wordy – if they’re actually looking for a very short answer, give one! Good interviewers should avoid yes or no questions, but you should also usually avoid yes or no answers – give your responses some substance! Be wary of putting your foot in your mouth – some interviewers think it is good to leave an uncomfortable silence hoping the interviewee will fill it with something unfortunately telling. If you can find a willing victim (or torturer?), get them to role-play this with you – ideally by throwing new questions your way, not a list you’ve prepared (if you’ve done enough of this, you’ll end up in a position where you’ll rarely have an entirely out of the blue question). Remember you can always do two useful things – pause for a few moments and think before you open your mouth, and ask clarifying questions. STAR gives you the scaffolding to hang a reasonable answer off for behavioural interview questions, no matter what. I find it useful to think of key points, mentally stick out that number of fingers, and tick them off as I explain them to the interview panel so I don’t forget anything key. 
(Hmm, there are 5 things about that I need to do, they are…). You can easily claim 3-5 seconds of thinking time without it being too awkward. Sometimes you will be given a question you can’t immediately think of a real world example of, or even haven’t actually come across – here, still use STAR to describe the hypothetical situation, what you’re going to do about it, and what are the likely or possible results, complications, resolutions, etc. That can really highlight your thinking processes (in a good or bad way – make sure it’s the former). </div><div><br /></div><div>It is also worth using the considerable effort you probably put into this to further good use by using those same examples to flesh out the achievements listed in your CV – if you haven’t already done so – focussing particularly on the Results of your Actions. I’ve found this the most challenging part of doing what is apparently a better CV - answering the “so what?” when you list an achievement! Also, a lot of what I think are important things to have on a CV aren’t achievements, so you have to work to turn the raw ore of a list of skills into a refined and polished crown, diademed with specific and measurable achievements. 
</div><div><br /></div><h1 style="text-align: left;">What’s “good enough” to apply?</h1><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipXzVDQSrzVgvqLulzObvkgLwUlPW2Xi2f7bmmiGAByVmNppOrrplC3hclmI649MHbjDOvRdG7KmNsQvDx2aBdrgZIwqZK6TXxC0HaWIzTOweDWLWTEZULhBk3TDcSBVLE7VwotBblVp4/s500/photo-1531913223931-b0d3198229ee.jpg" style="margin-left: auto; margin-right: auto;"><img alt="Note book page saying "am I good enough" with pen and pencil in the margin" border="0" data-original-height="500" data-original-width="335" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipXzVDQSrzVgvqLulzObvkgLwUlPW2Xi2f7bmmiGAByVmNppOrrplC3hclmI649MHbjDOvRdG7KmNsQvDx2aBdrgZIwqZK6TXxC0HaWIzTOweDWLWTEZULhBk3TDcSBVLE7VwotBblVp4/w214-h320/photo-1531913223931-b0d3198229ee.jpg" width="214" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><a href="https://unsplash.com/photos/3xNn1zGvBwY">Picture by @helloimnik on Unsplash</a></td></tr></tbody></table><div><br /><br /></div><div>I’ve had conflicting advice from various people about this. </div><div><br /></div><div>A recruitment consultant has said (I paraphrase, of course): “if you don’t meet every single one of the essentials and probably all of the desirables, don’t bother applying, because they will get candidates who meet them all, and dump your underqualified CV in the bin”. </div><div><br /></div><div>When asked a similar question, someone in the industry laughed and said “Heh, they’re a wishlist – if you meet most of the major points, apply”.</div><div><br /></div><div>I’ve seen that it can be quite hard to find candidates that even meet the most basic requirements sometimes (specialised skills in an area that lacks qualified candidates). 
</div><div><br /></div><div> Exactly where on that continuum the truth lies probably depends on the market you’re in and the job role you’re applying for. If you have scarce skills in that market, and they’ve asked for scarce skills across a lot of areas of expertise, even if you don’t meet all the “core” requirements, you might still be the best candidate they see. If, however you see that 1,237 other people have applied to the position, it’s more likely they’ll get exactly what they’re wishing for, and your “taking a chance” CV will not get far.</div><div>From a “sound investment of your own time” point of view, you’re better off selecting jobs you’re a better match to, particularly if you take time to craft your applications to specific openings! </div><div><br /></div><h1 style="text-align: left;">Prepare for some tech questions</h1><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJOrrIrdiOEyuBOT7kQWGfkud5cxUa1PVFw115oX86H_I56iJZ_bZiMdWpZHsi_NYYom3xs-jFzmnSrykDhmSGsFBtuIxskQ4siR7GziQBL-5Dr6iTeaSF3XkeB6o1QEpober8oJjNpAo/s1950/photo-1586772002130-b0f3daa6288b.jpg" style="margin-left: auto; margin-right: auto;"><img alt="Two technicians in a datacentre aisle" border="0" data-original-height="1300" data-original-width="1950" height="427" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJOrrIrdiOEyuBOT7kQWGfkud5cxUa1PVFw115oX86H_I56iJZ_bZiMdWpZHsi_NYYom3xs-jFzmnSrykDhmSGsFBtuIxskQ4siR7GziQBL-5Dr6iTeaSF3XkeB6o1QEpober8oJjNpAo/w640-h427/photo-1586772002130-b0f3daa6288b.jpg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><a href="https://unsplash.com/photos/3kgzvab8SMg">Picture by @scienceinhd on Unsplash</a></td></tr></tbody></table><div><br />In going for IT positions, you can expect the odd technical question (Surprised? 
Thought not!). Make sure you can answer any reasonable question that is likely to come up from reading about what the job will entail (notably if it is listed in the description). Have a plan for how to deal with curve-balls; use them to illustrate your problem-solving and information literacy, if nothing else. Spend a few days reviewing key technologies they’re asking for; the details you don’t regularly use often get rusty. You should be aware that some interviewers like to see how in depth you go – sometimes way further than they’d expect you to be in that position, which can leave you feeling like you failed that part of the interview, even if you actually nailed it. </div><div><br /></div><div>That said, the degree of depth that is reasonable to expect in a helpdesk technician is rather different from a senior engineer/architect position, so if you need to, hit the books to refresh your memory. If you’re going to be expected to work with a vendor, product or protocol you’re not that familiar with (assuming you then get to the interview stage…) come armed with some thoughts on how you’re going to adapt to it, and demonstrate existing progress to meeting their needs. </div><div><br /></div><div>It should be pretty easy to realise how to prepare for this - go look at the job description or any other information you have gathered about the role and think how someone might try to ascertain that you'll be able to do the job - what questions could you ask to see if the person lived up to the CV or a former job title - and particularly, can deliver on the one being applied for? If specific technologies or vendors are listed, spend some time refreshing your memory. 
You can even google things like "technical interview questions <role>" if you're stuck for inspiration.</div><div><br /></div><div>Not all tech questions should be answered with STAR – a straight “factual” question is not a STAR question – “describe the best path selection algorithm in Juniper’s implementation of BGP” does not warrant setting out an answer in the same way as “describe a weird routing issue you’ve seen” – make sure that you hit the key facts you’re expected to know, and perhaps highlight some interesting interactions or gotchas to show you’ve really “got it” on an operational level. How RPKI origin validation changes BGP bestpath is an interesting, topical example, as we think about BGP! </div><div><br /></div><div>Think about how to handle questions where you <b>don't</b> know the answer. Saying you don't know, but know how to find out (and demonstrating that) is quite valuable in employees, particularly when you get to upper levels of IT, where you are where the problem will sit until you resolve it. Again, sometimes, this is put in there to find out where your limits are; it is not necessarily a "game over" situation (unless of course you literally can't answer it, and being able to answer that is a requirement of the position; learn, address the knowledge gap, move on and nail the next interview). Treat this like any novel problem; explain what you would do if you hit that "on the job"; if you feel it's a deficiency, tell them (truthfully) how you're working on that skill or qualification, or how you plan to do so in the next <period of time>. Questions you can't answer (or where you feel your knowledge was inadequate) are always things you should immediately note after the interview and go and research. 
</div><div><br /></div><div>Consider using illustrations if they help – if you’re discussing a complex software architecture or network, draw a diagram, and illustrate it as you talk through it; don’t be afraid to refine and annotate this as you go along – in fact, do this wherever possible (practice doing so if you don’t regularly diagram things). Annotating diagrams is helpful to all involved. Indeed, if you can “spot” a question that lends itself to this treatment, prepare it so you look like a real pro when you effortlessly sketch it out on a piece of paper or a whiteboard and it isn’t immediately a huge mess – but don’t turn to one in a notebook you happen to have brought with you. A simple example is to be able to diagram the network of the last place you worked, or sketch out a workable system architecture for an organisation or ICT system based on a few key requirements, and be able to discuss and defend a particular design. Of course, if drawing a picture won’t help understanding, don’t spend time on it, either! If you think you’re going to do something “left field” in an interview, you can always say “To answer this question, I’d like to…. Is it OK if I do that?”. </div><div><br /></div><h1 style="text-align: left;">Are there any #lifehacks to get ahead of the curve? </h1><div>Maybe. Firstly, make sure you absolutely nail your initial application or other early approaches. Check if that company or industry “vertical” or job role typically has any strange standards and norms and that you live up to them. Get the basics right first!</div><div><br /></div><div>Some people say that the “unadvertised job market” is considerably larger than the advertised job market. I find this hard to believe, but it is plausible, particularly for more senior roles, that word of mouth and careful networking can get you ahead. 
I’m not certain how effective reaching out to random executives or managers (even with very nice stationery and a first class stamp) might be – but it’s a thing you could consider. If this interests you, research this approach (and no, this is not about randomly mailing out your CV!). </div><div><br /></div><div>You may be able to jump the queue a little, particularly if the advertisement says to contact someone to find out more about the position. You don’t want to hard sell here, but you do want to ask carefully considered questions that show your interest in the role, and perhaps nudge the person on the other end of the line into thinking about your suitability for the role. Obviously, this is your call (literally and figuratively). Then, make sure your CV crosses their desk, and they might just remember you from the phone call!</div><div><br /></div><h1 style="text-align: left;">Maybe you need professional help</h1><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_rwOGFH7VSmhJPZAH_rMbc4SdCaEr6LH79Ti71ufukmVYKj9Xv9r44Z2f5WiXgdgCHF0PUmoa7mI495zAWrtJD2WUW-SYy0tTM4-I9jeiLhP8Zy-tEkfK-me6zUDJv7cMQtF9gjtIj8g/s1050/photo-1454165804606-c3d57bc86b40.jpg" style="margin-left: auto; margin-right: auto;"><img alt="two people working together between two laptops and a pile of paper" border="0" data-original-height="701" data-original-width="1050" height="427" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_rwOGFH7VSmhJPZAH_rMbc4SdCaEr6LH79Ti71ufukmVYKj9Xv9r44Z2f5WiXgdgCHF0PUmoa7mI495zAWrtJD2WUW-SYy0tTM4-I9jeiLhP8Zy-tEkfK-me6zUDJv7cMQtF9gjtIj8g/w640-h427/photo-1454165804606-c3d57bc86b40.jpg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><a href="https://unsplash.com/photos/5fNmWej4tAA">Picture by @sctgrhm on 
Unsplash</a></td></tr></tbody></table><div><br /><br /></div><div>Job hunting is a weird game. Think for a minute about what you typically do when you encounter a new field that is full of unknown skills, pitfalls and rules you're not sure about. You, as a technically minded person with a moderately healthy professional income, will generally go one of two ways. </div><div><br /></div><div>The first is to set out to master the topic on your own. This can take a long time.</div><div><br /></div><div>The second is to pay someone who has already developed mastery to help you out, which can really short-cut you to the "getting results" stage. </div><div><br /></div><div>The first pays off for things you're going to use frequently, or perhaps for which mastery is its own personal reward (things you need to do and things you like to do). </div><div><br /></div><div>The latter is better for "once off" or particularly "hard" skills or practices you don't have a good handle on - or perhaps don't even "get" - things you have to do, but only very rarely (or, of course, areas you might not legally be able to do things in).</div><div><br /></div><div>There are lots of people who will help you with various forms of career coaching, from basic CV drafting through to an in-depth career coaching service that may even go as far as considering large career changes and how to get there. Take a little time to think whether you feel confident in your job search, CV/résumé and interviewing skills, and if you might need some help developing those beyond reading a few articles on the internet and setting some time aside to think and act on those insights. </div><div><br /></div><div>Only you know what you need, but if you're finding it hard to get any interviews, and/or all your interviews go poorly, your game is probably not "on point" and you could possibly use some assistance. 
Consider money spent here to be an investment in yourself of a sort (much like any training or certifications you self-fund). If you're not getting interviews, and after going through the process you start getting them, you're already seeing "ROI". </div><div><br /></div><div>Another way to think about it is how long it might take you to independently master the career game - and whether that time would be better put to some other use (like learning everyday job-relevant skills). There is a cost/benefit curve here, of course, but on the whole, for something like this, I'd argue getting someone else to help you out, particularly if you're not confident in your job prospects, has enormous value. </div><div><br /></div><div>It can also be really helpful to have a "cheerleader" in your corner to reflect on the process and give you critique, encouragement and advice - and a sense of accountability for your job search process - and sometimes, even a network of people to plug into. </div><div><br /></div><div>If you feel this is something you might benefit from, have a look around and find someone who you can work with. I spent a fair chunk of change on a service like this, and I think I've derived considerable value from it. </div><div><br /></div><h1 style="text-align: left;">What’s the tl;dr of recruitment, anyway? </h1><div>At the end of the day, recruitment (from the recruiting organisation’s side) is a game that seeks to try to answer two major questions. </div><div><br /></div><div>Firstly, can this candidate do the job? </div><div><br /></div><div>Secondly, in doing the job, are they going to fit in (to the team and/or company culture) without causing us unwarranted (management) dramas? </div><div><br /></div><div>The person the interviewer(s) feel best answers those two overarching questions will be the “best” candidate, and the only one they’d offer the position to. 
</div><div><br /></div><div>To some degree, it is a pretty silly game – it’s very hard to conclusively answer either of those two key questions from a few bits of paper and what is generally a fairly short conversation, but there are lots of people that think they’re able to do that (conversely, there are those that spend several days getting to know candidates, but that is vastly more expensive to the company – and your time, too). To a huge degree, of course, that means people that have “hacked” (or simply naturally “get”) interpersonal relationships will tend to do better in interviews (and those that are naturally pre-disposed to being their own best salespersons and shouting their greatness from the rooftops do better than those who sit quietly in the background fretting under the grip of their “impostor’s syndrome” – I naturally trend towards the latter end of that particular continuum, unfortunately). </div><div><br /></div><div>Each organisation (and even each team within an organisation) has a culture. Some will value the pure technical might of a candidate. Some will value the “soft skills”; most will value both. Work hardest on the thing you’re weakest on, multiplied by how important that skill looks to be in the marketplace.</div><div>This is also a reason why having more people meeting a candidate can be a good thing for the organisation – people have different skills in relating to other people, so the most technical person might be really good at figuring out the first question (can they do the job?), but might also be someone who would never pick up on the “creepy vibes” they hypothetically give to another co-worker, which could negate the interviewee meeting the second important question (do they fit in?) completely, and make them an entirely unsuitable candidate. 
You will therefore typically find that you will be interviewed by a panel, but all reasonable employers will let you know what the recruitment process looks like, and should tell you who the interviewers on your panel will be (or at least how many will be on it) and if there are additional phases or stages to the organisation. </div><div><br /></div><div>Don’t just take my word for it though – go and do some of your own research and reading!</div><div><br /></div><div>See also: <a href="https://schoolsysadmin.blogspot.com/2020/02/funemployment.html">https://schoolsysadmin.blogspot.com/2020/02/funemployment.html</a></div><div><br /></div>James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com0tag:blogger.com,1999:blog-5949954951402579361.post-56693639859780462552020-07-16T14:59:00.020+01:002020-07-16T23:06:42.968+01:00Read it, note it, (re)do it, teach IT: How to learn effectivelyIt's no secret that IT is a career in which you need to keep learning. <div><br /></div><div>It's also a given that many IT professionals are not given the time, space or resources they need to do this at work, so they end up doing it in their "spare time", to the detriment of other things they might otherwise like to be doing. Others are given this at work, but at a low priority, or in a half-hearted way where you have 8 hours of work a day you're expected to get done every day - sure you can spend some time learning, but you still need to deliver that 8 hours of what the business considers "work"! </div><div><br /></div><div>So it makes sense that no matter what we do, we ought to maximise the ROI on our "learning time"!</div><span><a name='more'></a></span><h1 style="text-align: left;">Lifelong learning</h1><div>I recently talked about <a href="https://schoolsysadmin.blogspot.com/2020/07/dunning-kruger-and-learning.html">how Dunning-Kruger applied to knowledge in IT</a>. 
Here, we'll be examining how we might most effectively navigate a path from the summit of Mount Stupid, through the chasm of the Valley of Despair, scale the treacherous Slope of Enlightenment and set out on our trek across the infinite Plateau of Sustainability - through better learning. </div><div><br /></div><div>As "knowledge workers", IT professionals have to get comfortable with the simple fact that they have co-opted a need for lifelong learning and professional development, and a high degree of information literacy. But how can we most effectively employ our limited learning time? </div><div><br /></div><div>There is, of course, research in neuroscience around learning - and forgetting. Here, I'm going to focus on circumventing the annoying "forgetting" circuitry that stymies those of us without an eidetic memory. </div><div><br /></div><div>Whilst <a href="https://en.wikipedia.org/wiki/Learning_styles">learning styles</a> are a somewhat contentious and mostly disputed model, it is helpful to understand how you prefer to learn (and it may be worth trying a few different styles to see how they work for you and picking the most effective, rather than the most pleasant). I hate audio and video for most technical topics. People ramble even more when they speak. They do so slowly; speeding the video up distorts the audio in various distracting ways. A lot of people have terrible diction or distracting accents (or terrible taste in background music). It's hard to skip past the irrelevant bits where they can't edit down the waffle (guilty as charged!). The playback speed doesn't intuitively react to my own processing of the words (unlike when I read). I <i>like </i>written words and (mostly) still pictures - I've spent something approaching 40 years training my brain to use this mode to onboard information (and yes, I had a stage where all I wanted to do was read "fact books", and a stage where I memorised the binomial Latin names of fishes). 
My wife often comments that I can't <i>possibly</i> learn the way I do - I read a book a couple of times, and that is my "learning technique" (I long for the days of my youth where I only had to read the book once...). I find actual dead-tree books better than screens (kindle or laptop). </div><div><br /></div><div>It's possible there are better ways to learn (or "not forget"), no matter what <i>your</i> current method is. </div><div><br /></div><div>Let's examine some. </div><div><br /></div><div><br /></div><h1 style="text-align: left;">The neurological basis of memory</h1><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><img alt="human brain toy" height="288" src="https://images.unsplash.com/photo-1559757175-5700dde675bc?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1000&q=80" style="margin-left: auto; margin-right: auto;" width="512" /></td></tr><tr><td class="tr-caption" style="text-align: center;"><span style="text-align: left;">Brain and neuron<br />By </span>@averey <a href="https://unsplash.com/photos/IHfOpAzzjHM">https://unsplash.com/photos/IHfOpAzzjHM</a></td></tr></tbody></table><div><br /></div><div>You may already know that memory is encoded in the interconnections of neuronal cells in your brain and the synapses between them, and that forming a new memory requires novel re-wiring of parts of your brain. What seems to be the case with memory is the more those particular pathways are re-triggered (you "remember" and use those memories) the stronger they are. Rarely used memories decay. Some experiences never really get the chance to be long term memories. Really "significant" memories are stronger. 
For those with a programming background, our memories work much like associative arrays - we link things together with other things, and we most effectively learn and remember when we have several triggers (associations) for that memory, and have new information linked with something we already know. How often have you said or thought "oh, <i>that</i> reminds me..."? </div><div><br /></div><div>As capacious as your brain is for remembering things, there are limits to its capacity, and most people forget most of what they come across. There is also some research suggesting that too many memories may be distracting or sub-optimal for quick decision-making. Remember, we're the product of billions of years of snap decisions around whether that thing over there necessitates one of the <a href="https://en.wikipedia.org/wiki/Four_Fs_(evolution)">4 Fs - Fighting, Fleeing, Feeding and Reproduction</a>. Relatively little fitness has resulted from a lot of sittin' around, cogitatin' for most of evolutionary time, and there is a large body of thought that our abilities in this regard are primarily the equivalent of a peacock's tail - with similar drivers related to sexual selection and the outlandish results that can have. We're wired for what worked for (thousands of generations of) common ancestors, not necessarily for the modern world. Witness phobias, which are almost always around things that exist in the wilderness - there are far fewer people who have a phobia around cars, or high voltage electricity - both more dangerous and more frequently encountered than venomous snakes or spiders by most people today. 
Circling back to the point here - your brain's mechanisms are all based on what worked over thousands of years, and that was mainly "chuck out most of this coincidental garbage and keep the real gems".</div><div><br /></div><div>This forgetful filter includes discarding things you might not want to forget, and you have little chance to circumvent that (although my brain seems to prioritise information where I think "hmm, that <b>is</b> interesting" - and where that happens, I will often at a later date recall that thing and having "read it somewhere"). This garbage collection doesn't pay much attention to "this is <i>work</i> knowledge and therefore inherently important". Sorry! </div><div><br /></div><div>Similarly, things we don't regularly use tend to decay. Can you remember all your past childhood home telephone numbers? I can remember some of them, including one from the ages of 6-10, in part because I mentally repeated it so often, in part because I remember the DTMF tones it made, and partly, because it is a pattern-y number (464 4550); indeed, when I think of this number, I get the mental sound of both my voice saying the number in a particular cadence, and the DTMF tones synchronised with the number, and even a mental picture of my finger hitting the buttons themselves on the specific handset. How do I know the DTMF tones? Well, I used to dial it from the home phone and listen to the cool sounds... I typed my credit card information into enough web forms that I memorised it. Similarly, writing in a bank account number monthly onto a form to be paid as a contractor. South Africa's obsession with ID numbers on everything means I remember that, too. Of course, sadly, not everything we need to know is interesting or stimulating, or used super-frequently. </div><div><br /></div><div>So there must be ways of hacking this system, right? 
</div><div><br /></div><div>Yes, but there are no short-cuts (much like in uncovering new exploits) - there isn't a script kiddie friendly version of Kali for your brain - you're going to have to put in the hours, one way or another! </div><div><br /></div><div>It bears stating that all of this is based on developing science and understanding, and on models of learning and memory that may be incomplete, or perhaps even wrong - so see how you go with each of these and how they work for you, and figure out your own personal bag of tricks. And do more research of your own on this, too! </div><div><br /></div><h1 style="text-align: left;">Repeat, repeat, repeat. Link, link, link. </h1><div>I've noticed that repetition, in all its various forms, seems to be pretty key to lasting memory. Research apparently suggests much the same. Things you don't regularly use decay. Things you use regularly are right there to be used. Things somewhere in between are harder to "access". </div><div><br /></div><div>Similarly, things that are linked to other topics, feelings, sounds, scents and so on are also easier to remember in the first place, and then to recall or trigger those memories or associated knowledge. </div><div><br /></div><div>So most of the neurological "hacks" are based on subverting the neurological machinery that sorts memory into the "keep" and "discard" piles. At its simplistic essence, this comes down to "things that are used often get remembered" (repetition), and "things that are linked to many other things are probably important" (linking); forming internal models of a concept and contextualising that knowledge (in terms of rationale and importance of it to us) also seems to help. </div><div><br /></div><div>Be aware that neural plasticity changes with age - kids are literally wired to be learning sponges; older people's brains start to lose this plasticity (the "evolutionary psychology" just-so-story runs along the lines of: Kids must learn! 
You, old person, survived this long, your current mental tool-set is good enough, change may be worse. Don't change!). </div><div>So yes, kick yourself for not starting this journey younger. And yes, I do find it considerably harder now than I did as a kid (where it was quite literally effortless, so long as I "saw the point of it"), but I'm still quite good at it, I think - but even by university, I found learning more effort than I did as a tweenager, and the dumbing down "lies to children" that we do because "kids can't cope with more than that" strikes me as a missed opportunity. Oh well. </div><div>A sobering, related point: if you're over 40, you are, for a human, extremely old compared to the average lifespan of humans over most of our 100,000 or so year evolutionary history; anything beyond that is a bonus, and, for your genes, somewhat uncharted waters. You've been an adult for a while! </div><div><br /></div><h2 style="text-align: left;">Repetition</h2><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><img alt="man in black pants under blue metal frame" height="640" src="https://images.unsplash.com/photo-1526283706298-03ba520844d1?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1000&q=80" style="margin-left: auto; margin-right: auto;" width="512" /></td></tr><tr><td class="tr-caption" style="text-align: center;">Repetition!<br />By @serhatbeyazkaya <a href="https://unsplash.com/photos/6OmkdtxJzYE">https://unsplash.com/photos/6OmkdtxJzYE</a></td></tr></tbody></table><div><br /></div><div>An absolutely key feature of all effective learning seems to be repetition. Go over it enough times, and information sticks. This means you can expect to spend quite some time to really learn a subject. 
</div><div><br /></div><div>When you read someone else's journey to a certificate, and they say "it took me a year and a half of concerted study" and you think "but I can read the book in a day or two", those are not the same journey at all. The person who's lived with the material for months <i>knows</i> it. They've gone over the material several times. They've played with it. They've probably read several different books on the same subject. The key to professional development (not simply passing exams) is you need knowledge that sticks. This only comes through time and repetition!</div><div><br /></div><h3 style="text-align: left;">On making notes</h3><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><img alt="eyeglasses on white notebook" height="640" src="https://images.unsplash.com/photo-1520076794559-6a1229412a42?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1000&q=80" style="margin-left: auto; margin-right: auto;" width="512" /></td></tr><tr><td class="tr-caption" style="text-align: center;">Notes, FTW.<br />from @grohsfabian <a href="https://unsplash.com/photos/GVASc0_Aam0">https://unsplash.com/photos/GVASc0_Aam0</a><br /><br /></td></tr></tbody></table><div>Good note-taking does several things. Not only does it provide you with a fully customised set of knowledge in your own voice, creating that content requires you to engage with the material in a different way, and a way that involves more senses. It really does seem to help you retain the content. It is, of course, a form of repetition, but a particularly effective one, it seems. </div><div><br /></div><div>When I listen to someone, there is a certain element of in one ear, out the other. Take notes while I do that, and the experience is much more transformative. 
Similarly, receiving a set of someone else's notes, hand-outs or copies of slides, is absolutely not the same thing as making your own notes, and has considerably less value. I'm often tempted to go "yeah, I've got the book though, I can just re-read it". <b>No!</b> <i>Bad learner!</i> Well, even stubborn just-read-the-book-learning me recognises that notes are effective. Similarly, highlighting passages makes the learning more multi-sensory, but actually (hand) writing notes is much more effective (and doesn't deface books). Talking of multi-sensory, remember my example of childhood memories, specifically phone numbers? Not only repetition, but visual (numbers on the keypad), and sound (DTMF tones). </div><div><br /></div><div>The least effective is if I try to transcribe what someone is saying by typing it out in near real-time - this is a straight ear->fingers matter, and I am absolutely not inwardly digesting the material. For some reason, attempting to hand write the same matter is a bit more effective, but summarised key point notes are much more effective. <a href="https://www.psychologicalscience.org/news/were-only-human/ink-on-paper-some-notes-on-note-taking.html">A small study</a> illustrates this quite effectively. A key difference between "traditional" note-taking and the attempted live transcription that seems to happen once you have a keyboard is there is a greater degree of active processing in taking notes on paper - thinking about the material immediately, highlighting key or important points - and often includes the random mental tangents that aren't even part of the original material, but are your reaction and personal context for the material. It is semantically much richer to the individual. All of that makes the subject matter much more "important" and sticky for your memory!</div><div><br /></div><div>Demonstrate this to yourself. Take a low stakes (i.e. 
free) course in something where you take notes, and a similar (but not the same, obviously) subject without taking notes. How well do you do in a final test on each subject? How well do you do if you repeat the tests for each <i>six months later</i>? Ideally, you should have some replicate trials, but I suspect one will be enough to demonstrate the point!</div><div><br /></div><div>Make notes. If you hate making notes, figure out some other way that you have to actively transcribe, collate, relate, contextualise and relay the information into different formats yourself. Try to force yourself to take notes anyway. </div><div><br /></div><div>How you take notes is entirely your own deal. It doesn't have to be a neat essay - indeed, for most people, a crazy mess of their own "hieroglyphics", arrows, underlining, circles, boxes, relative alignment and other features seem to work better - indeed, some notes look a little like mind maps (see later section). But what works for you is what you should use in the end. Experiment! Long term legibility may not even be important - it is the active learning process of the note-taking activity itself that seems most important, not that there is a reference to review. Indeed, approaching things you're taking notes on merely as initial guidance or an introductory framework for further research and study from other sources (much like university lectures used to be, and perhaps still are in some courses) is probably a good move if you're trying to achieve mastery. </div><div><br /></div><h3 style="text-align: left;">Re-certification</h3><div>Most IT certifications have ceased to be "lifelong" - if they ever really were. Whilst it can be somewhat annoying to have to keep forking over chunks of your hard earned to some soulless corporate behemoth, spending time studying, and subjecting yourself to exam stress, there is an upside. 
</div><div><br /></div><div>Firstly, the curricula evolve, usually for the better, and help you keep up to date with the ceaseless march of technological progress. Secondly, re-doing the same exam helps you to cement the details in your mind. Much like re-watching or re-reading favourite movies or books can bring out new elements, so does re-learning topics; you'll revisit the theoretical with practical experience; you'll make deeper connections; and see new context for how all the bits fit together. I'd certainly encourage people to move up the certification hierarchy, but there is value in re-doing your existing qualification if the higher one isn't for you just yet. And eventually, you can't go higher, so you'll be "stuck" redoing the "expert" level qualification exam. (Tough life, innit?) Well, congratulations on reaching the top, but there's still more to learn, believe it or not... By this stage, you might just reach the joyous circumstance where you feel you still know nothing and are still a bit of a fraud. If you do, congratulations, you've got <a href="https://en.wikipedia.org/wiki/Impostor_syndrome">Impostor's</a>, so harness that energy to get something great done!</div><div><br /></div><div>Perhaps cynically, part of the reason for the IT learning treadmill is, arguably, a need to introduce new features so they can sell new versions of the product - but you'll still need to refresh the basics, and re-certifying is a useful way of doing that. </div><div><br /></div><div><br /></div><h3 style="text-align: left;">Writing</h3><div>First of all, I'm not talking about notes. I'm talking about long-form writing in a semi-formal way - making documentation, creating blog posts or even writing book chapters or entire books. This doesn't have to be for public consumption, but you should put effort in as if it were, striving for clarity and insight (and being amazed when you deliver it). 
</div><div><br /></div><div>In the same way I like reading written words, I also like playing with them on paper myself. I find it's a good way to highlight areas that are "grey" in your mind. Setting out to put down a reasonable treatise on a topic tends to show you exactly what bits you're not too certain about, spurring you to discover up until then "unknown unknowns" and immediately turn them into "known unknowns" - which of course, being the diligent autodidact you are, will spur you to do the work to turn that topic into a "known known". </div><div><br /></div><div>It's not as effective as teaching, because your own sense of what is clear isn't always totally on point (*cough*) but it is a useful technique to help you cement thoughts. </div><div><br /></div><div>You can of course share this writing (the internet makes it rather easy to do so). Even if you feel you're "howling into the void", someone may read it, and it may have been your particular angle on the topic that finally made it "click" for them. Outstanding job. </div><div><br /></div><div>It also shows you a sort of measurable output or outcome of your learning that not only documents your understanding, but contributes meaningfully to it. 
</div><div><br /></div><h3 style="text-align: left;">Labs</h3><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><img alt="two brown dogs" height="410" src="https://images.unsplash.com/photo-1526479540275-62783538d64b?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1000&q=80" style="margin-left: auto; margin-right: auto;" width="512" /></td></tr><tr><td class="tr-caption" style="text-align: center;">Lab every day!<br />By @wadeaustinellis <a href="https://unsplash.com/photos/NdZxzD9QlSY">https://unsplash.com/photos/NdZxzD9QlSY</a></td></tr></tbody></table><div><br /></div><div>For practical disciplines, nothing can possibly beat actually doing the discipline. You can theoretically discuss storage subsystems, or hypervisors, or the BGP bestpath selection algorithm until you're blue in the face, but mostly, people care about how that manifests in systems doing what they're supposed to do in practice. So it's pretty intuitively obvious that you're going to want to get down to the "doing" at some point - and that will typically be in some sort of lab environment. Scrounge, borrow or buy spare kit, or virtualised versions of the platform you're trying to master. You can find free examples of some, or pay for access to others, or build up your own lab, either physical or virtualised, on hardware that's become reasonably affordable to many because of the relentless march of Moore's Law. </div><div><br /></div><div>We'll cover this in more detail below under Practice, but doing and re-doing lab exercises is inherently "repetition". </div><div><br /></div><h3 style="text-align: left;">Different treatments of the "same" content</h3><div>It can be helpful to get subject matter content in a variety of forms or from different authors/creators. 
People have different ways of processing information, so it just might happen that you find some information just works better for you. Even in the absence of that, the repeated exposure to the concepts helps you to cement the information and better retain it. There may be higher value ways of "repeating" knowledge, though (labs, notes), but this is worth trying. </div><div><br /></div><div>Written word not working for you? Try video or audio; I don't find them that helpful, but you might be very different (or, perhaps, suffer from dyslexia, making reading a struggle). </div><div><br /></div><div>Here's another way to approach this - do something in a different way. Struggling to learn to program? Get an Arduino and some basic passive electronic components and build some programmable circuits. A lot of people find the concepts click better when they can hold a physical artifact in their hands; it's more "real" than lines of code, somehow. </div><div><br /></div><div>You'll see that a lot of these topics, when you step back and look across them as a whole (and use several of them), mean you're inherently doing different treatments of the same content. This is good for learning, not only because it's repetition and linking, but because the different "modes" of content delivery help you to potentially find meaning in different ways - and that also helps!</div><div><br /></div><h2 style="text-align: left;">Practice</h2><div>Ever noticed how many successful IT pros are encouraging people to <a href="https://twitter.com/search?q=%23LabEveryday">#LabEveryday</a>? There is a reason for that. </div><div><br /></div><div>Firstly, learning in production is... unwise. </div><div>Secondly, you can break things with no worries about problems, and see what happens for yourself. </div><div>Thirdly, labbing every day is practice, which is... repetition! 
</div><div><br /></div><div>You're continually reinforcing your knowledge of command line syntax and commands; what the outputs are like, and, vitally, what "normal" looks like. It's sad that this is mostly expected to be at home, and expects a certain level of income so that you can even afford to do so (most of the top platforms are not available for free, although things like <a href="https://jlabs.juniper.net/vlabs/">vlabs</a> exist, and you can often get a demo version of a software product or cloud service). It obviously makes the most sense to practice on platforms that you are using day to day or aiming to certify in, once you've grasped the basics. For the vendor neutral basics, of course, anything will do, although you may build up mental models that are strongly influenced by some of the unusual ways some vendors do certain things. You must be amenable to changing and extending those mental models!</div><div><br /></div><div>For example, Nortel/Avaya and VLANs is... backwards, in that you add ports to VLANs, not VLANs to ports like pretty much everyone else - but there is some elegance to this "other" way of doing it, in a way - but it stands out as "wrong", because everyone else does it another way - and, well, when I go back to a Nortel/Avaya (an infrequent event) it has historically taken me a while to remember the "backwards" syntax (confession: I usually ended up googling it). Now that I have the relatively recently discovered mental model "Nortel/Avaya is backwards", it's <i>much</i> easier to remember this fact and <i>way</i> quicker to get back to the right syntax (hey, I suspect if you ever get into this situation yourself, you might even think "wait, Nortel/Avaya VLANing is <i>backwards</i>, I read that somewhere" and get to the answer way quicker - possibly without google). This is also an example of <i>reframing</i>, as well as <i>linking</i> it to the rest of your mental model of VLANing. 
You will also find that people who learnt to VLAN on Nortel/Avaya possibly think everyone else is backwards (my first VLANing was on Cisco). Whether this was an architectural vision ("ports are members of VLANs, so you configure them under VLAN commands" - rather than the alternative "ports are tagged with VLANs, so you configure this under the (switch)port commands"), or simply trying to avoid getting sued for "copying" by a vendor with a logo that looks like a famous bridge who everyone knows does it the other way, I'm not sure! I suspect an awful lot of the slight variations in command syntax either comes from different conceptual models of how the various IT layers fit together and what belongs to what - or, sadly, probably more often - a keen desire to avoid getting sued for "copying" features. I suspect the trend towards (net)devops, cloud and Infrastructure as Code will lead to further abstractions here quite quickly, and a lot of people will never even realise that different commands are being run on different platforms. Until then (and for advanced users), you're going to have to learn the syntax (or perhaps pay the syn-tax? ;)) for each platform you use; different vendors have different degrees of pain here. </div><div><br /></div><div>If you regularly lab with different vendors (particularly if you use one vendor in your day job, but want to stay current with others), you'll retain that knowledge far better if you can keep "current" across them. It may be helpful to have some clear context switch when you're learning each vendor - something as simple as only doing one single vendor (or platform, where those differ) on a day, or having very different looking terminal windows or GUIs, or even doing different vendor labs from different positions in the room (or home). Similarly, the more time you spend doing anything, the better you get at it - particularly if you carefully reflect on what you're doing and actively seek improvement. 
We rarely learn multiple foreign languages in the same classroom or with the same teacher, helping us to separate out those languages. IT platforms are not much different from that!</div><div><br /></div><div>There's another big reason - practical, personal experience is way more significant in cementing learning than abstract knowledge. You remember it far easier if <b>you</b> type the instructions than watch a video of them being typed. The <a href="https://en.wikipedia.org/wiki/Socratic_method">Socratic method</a> of teaching, or variants of it (where you teach by asking questions people have to answer) is quite effective for more advanced groups (my A-level biology teacher taught like this a lot in our tiny 6th form top set of six people, and treated us more like 3rd year undergraduates than high school kids) - you have to engage your brain, and you're not just waiting there like a sack of potatoes for knowledge to be thrown at you to see what sticks. You'll remember tech stuff even better if you keep having to type in the commands without blindly following a "recipe". Most people expect you to be able to do most of the basic and intermediate commands from memory; the more advanced you are, the more is expected to be "at your fingertips" - which is why you get paid the big bucks. You can achieve more complicated things more quickly, and they will be more robust/anti-fragile. You've figured out what is needed, the best way to provide that, and already know how to implement it. You'll probably still test it out in some sort of test environment, because that is what professionals do. 
Similarly, you'll know the system well enough that you look at it in a troubled state, and things jump out at you as "odd", or you know how to get it to spit out the appropriate arcane diagnostics to really dig into the issue.</div><div><br /></div><div>Labs are great for experiencing the topology diagrams in books, or the actual behaviour (or a close analogue) of systems; to experiment with new ideas. To do the "what if" without causing P1 outages. All of that "playing" helps you to practice your day-to-day administrative skills, and build a richer personal context for the conceptual learning. It is always worth spending a few days with as close to the functional environment as you can get for testing "what ifs" before rolling out a new architecture. The bigger the change, or the less familiar the gear, the more helpful this is, particularly where there are differences between physical kit and virtualised platforms. Of course, there are many things that are hard to accurately model (full internet scale particularly, and full production loads in many cases). </div><div><br /></div><div>Also, there is no substitute for experience - having been there and done it (possibly even having got the t-shirt), you just have a larger mental experience base to apply to problems - and suggest better solutions from a place of experience, and can say: </div><div>"yeah, it works in the lab at small scales, but when you scale it up to production environment of X nodes, not so much". </div><div>"Oh yeah, I've seen this before, it usually happens when...". </div><div>"Ooooh, this looks like when X does Y because Z". 
</div><div>You might also have seen more mixed vendor interaction behaviours, and hopefully have picked up nuggets of experience across IT, being able to give slightly more holistic answers to bigger problems.</div><div><br /></div><div>Beware vendor lock-in in your own training (balance this against the possibly huge amounts of effort invested in multiple vendor knowledge). If the only tool you have is a hammer...</div><div><br /></div><h2 style="text-align: left;">Linking</h2><p style="text-align: left;">Linking concepts together with others helps you to remember things. It's worth finding ways to help you to find links you can use to enrich these inter-connections and pathways in your own mind.</p><table cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><img height="272" src="https://alumnigroups.osu.edu/tuscarawas/wp-content/uploads/sites/83/2016/03/oval_aerial_2887.jpg" style="margin-left: auto; margin-right: auto;" width="597" /></td></tr><tr><td class="tr-caption" style="text-align: center;">Quad at OSU showing paved desire paths. <br />from <a href="https://alumnigroups.osu.edu/tuscarawas/wp-content/uploads/sites/83/2016/03/oval_aerial_2887.jpg">https://alumnigroups.osu.edu/tuscarawas/wp-content/uploads/sites/83/2016/03/oval_aerial_2887.jpg</a></td></tr></tbody></table><h3 style="text-align: left;">Wikis</h3><div>The most immediately obvious interlinked knowledge artifact is a wiki. They work very well for hyperlinking information and related concepts, and I really like them for technical documentation within technical teams, because they're easy to create, expand, and hint at missing information that needs to be filled in. They're best used as collaborative tools, and you may find value not only in using a wiki, but in extending one your team makes use of. 
You'll also find it a useful place to curate infrequently done tasks that take considerable time to work out the first time, but are done so infrequently that the next time you do it, you have to start from scratch - unless you've documented the process! Writing out clear instructions in there is a good step to being able to offload tasks to more new or junior team members, or a good stepping-stone to automation. It's also a low effort way to build up considerable documentation, and is considerably easier to zip around than a gigantic text document, or worse, a directory full of documents (even if you use search, table of contents or in doc hyperlinks). They're less useful to build up study notes, but if this works for you, by all means, use it. Certainly, with their tendency to cluster related topics and easily launch to them, they work well with the way my brain tends to explode outwards into related topics in a way that relies a lot less on footnotes and parenthetical comments, and is probably easier to follow as a result!</div><div><br /></div><div>Mature teams have good documentation. At risk teams use scattered "institutional knowledge" stored within the crania of particular people. </div><div><br /></div><h3 style="text-align: left;">Mind maps</h3><div>In the 1990s, there was a big fad for this guy called <a href="https://www.tonybuzan.com/">Tony Buzan</a> and the concept of "<a href="https://en.wikipedia.org/wiki/Mind_map">mind maps</a>" (although there were things like this much earlier, and even Buzan started promoting this from the early 1970s). One of its big claims is that it's supposed to work a little more like our brains do in sorting, linking and retrieving memory. Not totally outlandish, and you may find the approach useful in unpacking ideas or making graphical notes on topics. It can be a good way of exploring how you link up various concepts. 
Another tool for you to read about further and try out!</div><div><br /></div><h3 style="text-align: left;">How does &lt;this&gt; relate to &lt;that&gt;? </h3><div>A lot of IT is inter-related - it pays to be aware of the broader context and inter-relationship of all the bits involved. Obviously, there is a lot of content to it; the more you're aware of, the richer your overall understanding can be. It can be quite useful to note these inter-relationships as a further structure for your thoughts or to highlight areas you don't know, or are fuzzy. Those inter-linkages will serve you well in two ways; firstly, it helps get the knowledge lodged in your brain (linkages help memory!). Secondly, it helps when you're troubleshooting, because you have a pre-compiled map of how various things fit together that you can use to more quickly figure out what bit of the infrastructure stack has broken. </div><div><br /></div><div>Also, note how much easier it is to link another instance of an object type to a pre-existing category. My classic example is it's easy to add a new entry to the list of faces called "Dave", but it's very much harder to onboard a new name, perhaps Tarquin. But once you have, subsequent Tarquins become easy to add to the category "Tarquin" (they're symbolic links, in a way!). It may take several meetings before I'll get a new name down (and it's <i>much</i> easier if I see it written down). The stranger the name, the harder it is to remember. See, again, "linking" helps. </div><div><br /></div><div>I like drawing diagrams of how things fit together, and annotating them. This might be a habit from doing lots of network diagrams, or doing lots of biology - where annotated diagrams were part of the learning and assessment material. Again, it expresses information in another slightly different format to just words on a page, and does so whilst explicitly making the relationships and linkages clear. 
</div><div>Protip: Annotated diagrams are phenomenal tools when troubleshooting problems; use a hard-backed A4 size notebook for all your work notes - a habit instilled in my biology lab days that has served me well. Redrawing a diagram is better than photocopying or printing or simply just looking at someone else's diagram. You'll remember the Visio network diagram you drew from scratch way better than the one you printed out from documentation; you'll remember the network you built better than the one you inherited. </div><div><br /></div><div>Time spent noticing links and interactions makes seeing them or investigating their potential impacts in future much easier. Wise people have said that if you can't sketch a conceptual diagram of your network on the back of a napkin in a bar, it's too complicated!</div><div> </div><h3 style="text-align: left;">Mnemonic devices and songs / movements / stories</h3><div>Have you been to a preschool lately? They are weird spaces, doing learning in weird ways (compared to the staid classroom of later life). Yet those actions are really effective on our brains - you probably remember them really well. (<i>Head, shoulders, knees and toes...</i>). </div><div>Recite the alphabet. Does it have a melody attached? </div><div>Counting to 12 inevitably gets accompanied by a 1970s era funk backing track. </div><div><a href="https://www.youtube.com/watch?v=Hcx44e2gnfI">There's a pinball machine</a> involved. </div><div>1,2,3,4,5</div><div>6,7,8,9,10</div><div>11</div><div>12. </div><div>Thanks, Sesame Street. </div><div><br /></div><div>It might be worth thinking up some ways to harness those modes - repetition, song/melody, specific movements - to concepts. You might feel like an idiot, but hey, if it works...! And if you come up with an Expert level curriculum topic aid for IT concepts as a preschool level song and dance, <i>please</i> let me see it. 
These are excellent examples of how your brain links these multi-sensory experiences into rich, durable memories. </div><div><br /></div><div>What else do we have in preschool? Story time! Yay! Humans LOVE narrative. In one of the Science of Discworld books (I think number 2), the authors make the point that we should be called "<i>Pan narrans</i>" - the storytelling chimp (rather than the rather grandiose "<i>Homo sapiens</i>" - wise man). I like that, and I think there is some truth to it (even though the rules of taxonomy would make us all <i>Homo</i>, because <i>Homo sapiens</i> was named before any of the <i>Pan</i> chimpanzees, so it is the senior synonym, and thus takes precedence. Why, yes, I did taxonomy and systematics as a former profession). </div><div>They also introduce a fictitious substance "<a href="https://wiki.lspace.org/mediawiki/Narrativium">narrativium</a>" - a mysterious aether that suffuses stories, and, where a story needs a thing to be, brings it into being. Stories are much more memorable than simple brute fact. A good story goes a lot further in the human mind than a good fact (witness the spread of urban legend and real "fake news"). A good story has legs, and will run off into the sunset, possibly with your wallet. Stories are how we first built meaning, and how many cultures have conveyed history and knowledge through time immemorial - longer, certainly, than we've been writing things down, let alone formally learning things. Techies love a good tech story. <a href="https://www.cs.utah.edu/~elb/folklore/magic.html">Magic/more magic</a>, anyone? The <a href="http://web.mit.edu/jemorris/humor/500-miles">500 mile email outage</a>? Making good tech stories makes it easy to remember. Good tech stories also make <i>great</i> presentations, and good ways to bring facts together into a memorable whole.</div><div>Great teachers and memorable lessons are all too often really good storytellers - and stories. 
</div><div>A list of facts is instantly forgettable. </div><div>Weave them into a narrative structure, and people might just carry them forever. </div><div><br /></div><div>Mnemonic devices are, in a way, tiny stories. <i>Please Do Not Throw Sausage Pizza Away</i> is a useful way to remember the order of the seven OSI network model layers, and hey, I love some pizza, so it's memorable, clearly much more so than the one about <i>All somethingP somethingS somethingT Need Data Processing</i> that lays them out the other way around. I know someone called "Nita", and that makes remembering the four TCP/IP model layers quite easy. NITA. (some people prefer LITA). But it was an odd name and took me a while to remember it!</div><div><br /></div><div>If you don't get presented with useful recall/learning enhancing devices like these, it can be worth <a href="https://ciscohite.wordpress.com/2013/04/26/mnemonics-for-networking/">making your own up</a>. The process of thinking it up even helps to make it more memorable!</div><div><br /></div><div>You know what else is weird about preschools? They are free to learn through play. "<a href="https://en.wikipedia.org/wiki/Kindergarten">Kindergarten</a>" is German for "child garden" - it's designed to be a space to learn by playing and experiential learning. <a href="https://en.wikipedia.org/wiki/Montessori_education">Montessori</a> takes a somewhat more structured approach to semi-guided play-based education, but with a lot of self-determination and very centered around <a href="https://en.wikipedia.org/wiki/Constructivism_(philosophy_of_education)">constructivism</a>. 
</div><div>Quick pedagogical aside: <a href="https://el.media.mit.edu/logo-foundation/what_is_logo/logo_and_learning.html">Logo</a> was created as a <a href="https://en.wikipedia.org/wiki/Constructionism_(learning_theory)">constructionist</a> (an extension of constructivism) approach to getting children into computer programming (<a href="http://scratched.gse.harvard.edu/stories/beyond-programming-scratch-constructivist-learning-environment.html">Scratch</a> is a descendant). There's a lot of current work around bringing more of this learning approach into schools, and it is very promising, and done in the right way is very effective for inculcating "<a href="https://en.wikipedia.org/wiki/21st_century_skills">21st century skills</a>". If you want to learn more about that, check out <a href="https://inventtolearn.com/">Invent to Learn</a>.</div><div><br /></div><div>You know where adults learn IT by playing? Labs! (Maybe in Dev. Never in Prod. OK, fine you'll learn a LOT when Prod breaks, but goodness, that will involve some sweating...). </div><div><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><img alt="white ethernet switch" height="341" src="https://images.unsplash.com/photo-1551703599-6b3e8379aa8c?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1000&q=80" style="margin-left: auto; margin-right: auto;" width="512" /></td></tr><tr><td class="tr-caption" style="text-align: center;">Find a lab to play in...<br />By @thomasjsn <a href="https://unsplash.com/photos/qTEj-KMMq_Q">https://unsplash.com/photos/qTEj-KMMq_Q</a></td></tr></tbody></table><div><br />What do you vividly remember about classrooms later in life? 
The time when something blew up in chemistry lab (which is a lot if your teacher is <a href="https://en.wikipedia.org/wiki/Andrew_Szydlo">this outstanding educator</a>), or the time when the physics lab bench caught fire because the teacher used the wrong gauge wire connected up to a car battery and panicked until a lab tech sorted them out with a fire extinguisher... Notice that not only was such an experience multi-sensory and steeped in personal physical experience, but it's often a story, too. Sure, repeated enough times, you'll recall that "Mitochondria are the powerhouse of the cell", but your vivid memories will be actual experiences, not rote learned facts.</div><div>Oh, another kind of lab! Hmm, it's almost as if applying theory to practice and <i>doing</i> knowledge things with tangible results in our own hands and in multi-sensory ways is good at helping us learn... See the "Practice" section above if you didn't earlier. </div><div>Rote learning? No thanks - but if you repeat it enough times, it works too (repetition!). </div><div><br /></div><h3 style="text-align: left;">Chunking</h3><div>Your mind likes to get information in familiar "bite sized" chunks. Unsurprisingly, this is called "<a href="https://www.verywellmind.com/chunking-how-can-this-technique-improve-your-memory-2794969">chunking</a>". You will often find that organising things into these particular formats makes them easier to remember, either short or long term. </div><div><br /></div><div>A good example is telephone numbers. Many countries chunk ten digit numbers in patterns like 3-3-4. In those countries, you will find it MUCH easier if you follow this pattern (perhaps assuming you ever lived in the "learn phone numbers" era). For example 5554644550 is much harder to temporarily store (for instance to go from a directory to typing it in) than 555 464 4550. Francophone countries chunk phone numbers in groups of two digits, but I suspect that's because their number system is madness. 
98 is literally "four twenty eighteen". This can cause you significant grief when your grasp of French is tenuous, you forget about the chunking, and neither of you is conversing in your first language. (I had this experience in Madagascar in 2003). Similarly, you probably have a PIN buffer that is 4 digits long. Then you move to a country where PINs are 5 digits long. Then people invent six digit OTPs, and at first, that is hard, until you develop a six digit buffer pattern storage area. </div><div><br /></div><div>Similarly, IP addresses are chunked and transformed. Remembering 32 binary digits is a pain in the ass. Remembering dotted quads? If you've been in IT for any length of time, it's easy. Now you need to work on your IPv6 buffer. ;) (No, don't, that is what DNS is for, aside from perhaps the prefix that covers your network). </div><div><br /></div><div>Extend these principles to other bits of information you can "chunk"!</div><div><br /></div><h3 style="text-align: left;">Mind Palaces</h3><div>Some people have success with linking concepts to a mental image of a place or location - Sherlock Holmes' <a href="https://www.smithsonianmag.com/arts-culture/secrets-sherlocks-mind-palace-180949567/">mind palace</a> is a fictional example of the <a href="https://en.wikipedia.org/wiki/Method_of_loci">Method of Loci</a>. There was a time before widespread print (or Google) when people needed to work out methods to get their brains to store more information than we typically do today, and had fewer options to go back and review information (particularly before the printing press). This may be something you can get some mileage out of for storing information, but beware, it is likely to be one of those avenues that takes a while to get any mileage out of. 
Note this subverts several bits of your memory apparatus, particularly linking knowledge memories to other things you already know.</div><div><br /></div><h3 style="text-align: left;">Beware of humour?</h3><div>A long time ago, I read somewhere about how humour suppresses the formation of memory - in other words: Why is it so hard to remember really hilarious jokes? There is some research around the topic, of course. Most of it notes specifically that it is the unexpected <i>twist</i> in the most effective humour that makes it so hard to remember, because it subverts our mental predictive machinery. Humour is the result of the unexpected, but the rest of the brain machinery goes on the fritz as a side effect. My worry is that by using too much humour in teaching materials, this suppression of learning may go a little further than just making the jokes themselves hard to remember - and extend into the substantive learning content. Also, there is a risk that humour can fall flat or be distracting from the content if it is overdone or done poorly. Of course, enjoyable writing is always easier to get through. Chris Parker's outstanding <a href="https://www.networkfuntimes.com/">NetworkFunTimes</a> aims to blend a love of (stand up) comedy and network engineering. I've not noticed the odd bit of amusement putting me off my learning game going through his content, so this may be paranoia! Others even note humour <a href="https://effectiviology.com/humor-effect/">can aid memory</a> - but I think that's more the "narrativium" story-as-memory aid (or meme, in the original Dawkinsian sense) effect than humour being the magic touch. See how humour works for you! 
</div><div><br /></div><h3 style="text-align: left;">Strength of stimulus / reward / punishment - high vs low "stakes" learning</h3><div>If you learn about things like operant conditioning and the effect of various pleasant and unpleasant stimuli on memory, you will find that particularly "memorable" things are paired with particularly "effective" rewards (or punishments). If you consider your own experiences, you can probably remember particularly unpleasant events - as well as some particularly great or pleasant events and experiences, and your memory around this will often be extremely clear/vivid - often undesirably so for unpleasant or disturbing aspects of your past. Middle of the road mundane "non-events"? Typically pretty fuzzy, or entirely gone. </div><div><br /></div><div>You can possibly use this knowledge to apply to turbo-charging your own learning. I vividly remember lessons that were particularly awesome - but I also learnt fast in classes where the teacher was terrifying (however, I don't remember as much of the content from the terrifying classes as I do from the pleasant ones - but that may have more to do with the subject not being "interesting" to me than a side effect of the teaching style). A quick search around will show you that fear is typically negatively associated with good learning outcomes. Strong stimulus, and strong "stakes" prompt strong memory formation - this makes sense from an evolutionary standpoint; if it's high stakes, you're going to want to make sure the resulting (possibly hard earned) knowledge is going to stick around for later. </div><div><br /></div><div>This doesn't mean you need to get yourself a drill sergeant to push you (this is probably counter-productive), or get a sports car (or whatever) reward after you do well, but it takes little consideration to think "hey, I've noticed learning in these ways really works for me" and "doing this before or after learning really helps" - and do more of that. 
</div><div><br /></div><div>A fairly low cost way of having some kind of stake riding on the process is to be <i>accountable</i> to someone else for your learning - perhaps a "study buddy", partner, colleague or friend - with whom you can share your progress and demonstrate how you are (or are not!) progressing along your learning timeline. Accountability and deadlines are closely related - setting a defined date on which you are going to take a particular exam can also help spur you along on your journey - otherwise you may find competing priorities end up in infinite deferral! </div><div><br /></div><h3 style="text-align: left;">Project/Problem-based learning</h3><div>In the section on mnemonic devices and related handy memory scaffolding, I briefly mentioned constructivism. Here, I'll focus attention on it again, because from a significant amount of reading I did a number of years ago (when I was a school sysadmin trying to better understand what good education might be, and how tech might fit into that) I came across this paradigm, and it is powerful and extremely amenable to enriching technical content in particular. I will again plug <a href="https://inventtolearn.com/">Invent to Learn</a> - I think it is an important book, and that you will find it interesting and applicable to how you engage with young people, mentor or teach, or even discuss learning with teachers. I think you'll also immediately see how you can apply it to how you would build tech skills and understanding yourself. If you're trying to hack your own learning, it helps to understand some theory about teaching, learning and education!</div><div><br /></div><div>There's at least a whole book length worth of reading for you to do about this, but this style of learning is powerful in part because it's self-directed, and also, because <b>you</b> set the target and make your own "meaning". 
You don't <i>have</i> to do the reading, but trust me, you'll learn a lot better if you set up a technology you're trying to learn and tweak things to see what happens.</div><div><br /></div><div>One immediately applicable concept is that you're going to learn best by having a concrete project to achieve - not something like "finish reading the textbook", or "get that certification" - but "build something"! </div><div>Learning about networks? Build one! </div><div>Learning about RAID? Build RAID arrays! See the failure modes! Be amazed (if your array supports hot plug...) that you can pull a drive out and it keeps going! </div><div>Learning about hypervisors? Install one! </div><div>Play with the toys. Learn what they're doing. Click all the things. Type all the CLI commands. </div><div>Better still, have a complex project in mind to build. </div><div>Experiment. </div><div>Go wild. </div><div>Have fun, even. </div><div>To the lab with you!</div><div><br /></div><div>This is also why you learn a hell of a lot from solving real world problems with your infrastructure - it's undercover project based learning, and it's probably "high stakes" to really get those neurons firing!</div><div><br /></div><div>Other people's topologies and lab exercises are always a little less effective than those that have real meaning to you. If you need a prompt, imagine rebuilding some part of your business infrastructure - or a "better replacement" based on the tech you're learning, and design and implement around that. 
</div><div><br /></div><h2 style="text-align: left;">Teaching</h2><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><img alt="two men watching on silver MacBook" height="342" src="https://images.unsplash.com/photo-1555436169-20e93ea9a7ff?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1000&q=80" style="margin-left: auto; margin-right: auto;" width="512" /></td></tr><tr><td class="tr-caption" style="text-align: center;">Teach someone to cement your own knowledge!<br />By @josealjovin https://unsplash.com/photos/JZMdGltAHMo</td></tr></tbody></table><div><br /></div><div>I don't think that people that "just get" a topic necessarily make very good teachers. I think those that struggle to do so are ultimately more effective teachers of a topic, both because they can empathise with those grappling with the topic for the first time, and because they probably then have several different ways of mentally approaching that topic, which can help you to bring a topic to life in meaningful ways for everyone in that class. Have you ever sat in a classroom with a teacher that looks at you with the "but how can you <i>not</i> just see this thing" expression, exasperated as to how they can possibly explain it any way other than "it just is that way and you must learn it"? (I was sometimes that kid in Maths class, but teacher's pet in others - and I've met way too many Maths teachers in particular who can't dynamically approach a teaching problem!). The problem with maths is it's too absolute and un-fuzzy, so it's hard for people who are good at it to approach it from any other angle. 
Similarly, if you've only ever interacted with very intelligent or (over)educated people, it can be a rude awakening to sit in a classroom with "normal" or "below normal" people - which is going to be most of the working world, and customers - so get used to it, and learn how to relate. Teachers love the top set classes - but the true test of pedagogical mastery is the bottom set! Teaching the top set is fun. Teaching the bottom set is an achievement. You've really got it when you can teach anyone. Learning is good. Teaching is better. </div><div><br /></div><div>I <i>really</i> understood subnetting after teaching it to high school kids. </div><div>I learnt a lot about defense in depth when I... taught it to high school kids. </div><div>I cemented my elementary networking knowledge when I literally wrote a book about it... for high school kids. </div><div>I went on to further amplify all of these things mentoring adult team members too, but the crucible of getting this right was with teenagers - the most terrifying of all audiences (outside of a maxsec prison, anyway). </div><div><br /></div><div>For these reasons, I think that teaching and mentorship are excellent ways of cementing knowledge. Firstly, if you're doing it right, you're going to scaffold a way of introducing technical topics in a way that builds up from a solid foundation, and helps people link topics together in tangible ways. </div><div>Secondly, you're going to make damn sure you know the topic before standing in front of other people. </div><div>Thirdly, implicit in all of this is a stage of repetition, and often, of reframing knowledge, both of which revitalise your mental machinery - stomping down old pathways, and forging new links and ways of seeing things. </div><div>Fourthly, when you come across people not "getting" it, you're going to be forced to approach topics from new angles and unique perspectives that you alone cannot possibly dream up. 
</div><div>Fifthly, you're going to work out "hooks" to keep it interesting and people engaged - and that works on you just as much as them. </div><div>And yes, doing this well takes a lot of time. </div><div>As an added bonus, you improve other people, which is good and worthy in and of itself. </div><div><br /></div><h3 style="text-align: left;">What are some effective study methods from this "teaching" perspective? </h3><div><ul style="text-align: left;"><li>Well, the obvious route is to "teach" people you lead and mentor. </li><li>Presenting tech talks is a thinly veiled teaching exercise!</li><li>Once you get advanced enough, you may actually want to try more formal teaching of those subjects in appropriate places - either like school, or hopefully, more like university lectures. </li><li>You can create study groups in which you take turns "teaching" each other topics - it's pretty easy to find groups of people at similar stages in their journey to you online - or you can mentor people earlier in the journey, but be careful not to overstate your expertise or unintentionally mislead or misinform. This works quite well if you're in larger teams and you get the more junior members to work on this stuff, with a more experienced guide to make sure things don't go too far off the rails.</li><li>You can extend this study group concept into something similar to an academic "paper club" or "reading group" - take an RFC, presentation, white paper, current topic or some other useful bit of knowledge, present it ahead of time, and open up discussion within your peer group. This is obviously easier where members of the team have met this format before - it's daunting when they haven't. </li><li>You can write informative articles on the topic, although the feedback loop here is poor compared with a "live studio audience" of faces staring at you! 
</li><li>I've even heard of people teaching technical topics to their pets - but that is a slightly less useful, if at times rather enthusiastic, audience. </li></ul></div><div>All of these are like teaching, and all of them are worth adding to your basket of tricks. The more like formal teaching it is, the more valuable I think it is - both as an activity in its own right, and in improving your own mastery. Building up a syllabus or curriculum and delivering units of knowledge, practical experience and in some way assessing formative and summative knowledge are all useful. </div><div>By all means make use of things like "<a href="https://en.wikipedia.org/wiki/Flipped_classroom">flipped classroom</a>" approaches, or treat your direct contact sessions more like lectures (introductory guidance meant to stimulate independent further research, thought and discussion outside of class) rather than school lessons (where the content delivered is the content expected to be regurgitated). Or even like <a href="https://en.wikipedia.org/wiki/Tutorial_system">Oxbridge tutorials</a>, which tend to very much be about group dialogue, co-discovery and shared construction of meaning through discussion, debate and the meeting of independent research by several people. Doing so may help create a culture of enhanced information literacy and independent, critical thought, and that is a good thing! Many of my teachers liked answering questions with more questions - initially frustrating, this caused you to switch your brain into further thought, and independent research. Thanks, "difficult" teachers! </div><div><br /></div><div>Finally, remember that a lot of IT support is actually "just in time" teaching - providing exactly the right prompt to a customer, end user, or colleague to solve a particular problem they are having. Try to resist "doing it for them", unless there is some sound reason to do so. 
</div><div><br /></div><h1 style="text-align: left;">Reflect on your own "learning styles"</h1><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><img alt="bamboo raft" height="427" src="https://images.unsplash.com/photo-1536585806558-81c7ea4d393d?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1000&q=80" style="margin-left: auto; margin-right: auto;" width="640" /></td></tr><tr><td class="tr-caption" style="text-align: center;">Reflect on your learning!<br />By @joshuaearle https://unsplash.com/photos/EqztQX9btrE</td></tr></tbody></table><div><br /></div><div>As I mentioned earlier, there are schools of thought that different people learn better in different ways - in terms of ways of supplying info to your brain. This can also be extended into considering some of the ways of enhancing whatever delivery method you choose as covered above. Try to critically reflect on which methods actually work best for you. </div><div><br /></div><div>I find it hard to believe there are those who learn technical theoretical content best from YouTube videos. Sure, it's great for practical skills that have intricate motions (perhaps uncharitably: monkey see, monkey do!) - but most technical IT content is not like that (with the obvious exceptions like terminating and dressing cables). I suspect a lot of people like it because it is more entertaining (to them) rather than educating. "Edutainment", if you like. How much of that added entertainment enriches the content, and how much detracts? Are you getting more education, or more entertainment out of it? I also really dislike podcasts. I am all about books and blog posts - written works; static, labelled pictures you can pore over and consider at your own pace. You might be very different from me and vehemently disagree with my assessment of this - and that is fine! 
But do spend a little time trying to find some "objective truth" about how well you learn through different methods or media and stick with the ones that work best for you - if it's not available in your preferred medium, make it so, and share that content, if appropriate. The measure is not how easy you find it to gloss over the content or how "pleasant" the process is, but how well it sticks (months later). I am also cognisant that reading has been my dominant information onboarding method, so, maybe I'm biased, but I don't think so - it really is superior. </div><div><br /></div><div>I also know I have developed an ability to "cram" a lot of knowledge into a temporary mental buffer space. Some of it sticks, some of it does not - it's been a great way of getting more points in exams throughout my life, but in some ways, it is cheating yourself in the long term! As I know I have this ability, I have a choice to use it - or not. One of those decisions is harmful in the long term if I don't then make it more "honest" by cementing that short term gain in various ways.</div><div><br /></div><div>What are your learning goals? As a professional, it's <i>not</i> displaying you can pass the certification exam - it's displaying you deploy that level of skill, insight and knowledge day-to-day in the real world. The certification is a <i>validation</i> of your learning, not the target! Don't cram learn (and definitely don't use "dumps"). Don't seek the minimum required content (is this going to be on the test?) - you're trying to be a professional, not just get grades with the minimum effort possible. Realise you might (you, in fact, absolutely should) circle back to "old" topics to refresh your understanding, and build it to a deeper (or higher?) level, and link it to more topics and greater understanding. I guess this is why the advice on good CVs is to write about achievements, not simply list responsibilities and qualifications - show how you apply this stuff in the real world. 
The <a href="https://www.usenix.org/system-administrators-code-ethics">sysadmin code of ethics</a> specifically mentions education - "<i>I will continue to update and enhance my technical knowledge and other work-related skills. I will share my knowledge and experience with others.</i>" </div><div><br /></div><div><br /></div><h1 style="text-align: left;">Campaign for Change!</h1><div>So the treadmill is never going to slow down, and you can't get off. What can you do to improve your quality of life, other than optimising training for yourself in your "free" time? </div><div><br /></div><div>Well, the more people in the industry make training and development a "benefit" they seek, or transform regular training into something that makes obvious business sense, the more likely we are to have this stuff mainstreamed into our working lives, freeing up free time for more enjoyable pursuits (hey, no judgement, labbing is fun too, but you <i>need</i> to get out the house and spend time with loved ones too - wear a mask). </div><div><br /></div><div>Imagine: Supportive managers who campaign to get us that training, or some on the job free hours, or make kit available for labbing. Who note the need for paid study leave - and give it to us. A C-suite who recognises training their staff gives them a competitive advantage in both business as usual and recruitment. If you are a technical manager, you have some sway to get at least some of this done, either tacitly or through low grade guerrilla warfare!</div><div><br /></div><div>Make business cases - how much does it cost to hire an implementation consultant? How much does it cost to train staff to an equivalent level? (There are of course times when it's not a question of money, it's a question of time - training someone always takes longer than using a trained resource, which you will absolutely see as you start training/mentoring people yourself). 
Once trained (and retained, and sustained) you keep on having that resource - and may even be able to further monetize it. </div><div><br /></div><div>Around 15 years ago, I was in an organisation that was thinking about how we could build capacity across various technical (scientific) fields in lower and middle income countries (across the eastern coastline of Africa and its Indian Ocean islands). </div><div><br /></div><div>It came down to 4 key things: </div><div><br /></div><div><b>Attract</b>: You need to get people into your organisation and field. </div><div><b><br /></b></div><div><b>Train</b>: You need to equip them with the right skills. </div><div><b><br /></b></div><div><b>Retain</b>: You need to make sure they don't switch out careers (to another field or another organisation) - long is the list of qualified PhDs who end up working for banks or other choice employers in the developing world instead of research or conservation - or move to other countries to use their skills. Money talks - but so does professional satisfaction, above a certain level of meeting <a href="https://en.wikipedia.org/wiki/Maslow%27s_hierarchy_of_needs">Maslow's hierarchy</a>. Think about how this relates to your own sentiments about where you work, and should you lead or mentor others, how this might influence them. </div><div><b><br /></b></div><div><b>Sustain</b>: You need to keep them happy, and keep challenging work coming and ensure the resources needed to support that work are available - and develop healthy institutions to keep the ball rolling, long term. It is depressing to go overseas to a prestigious PhD programme and return home to dysfunctional nothing - aside from low pay, this lack of career prospects and support was the leading cause of hemorrhaging of talented and trained people. Science, much like tech, often needs shiny and expensive toys to progress (physicists and astronomers take this to particularly impressive heights). 
How can you sustain skilled professionals in your field? What do you need to feel "sustained and supported"? What do your team members need? </div><div><br /></div><div>This clearly required a highly integrated and holistic (if you'll forgive the phrase) approach. It helped to improve entire education systems, to grant scholarships, and to provide career-long access to research platforms, facilities, funding and opportunities. It was bold. It was visionary. It was very difficult to sell to politicians. However, it will be easier to do this at a rather smaller scale - your own organisation. How can you achieve those 4 key pillars in your own efforts to strengthen your team? </div><div><br /></div><h1 style="text-align: left;">Learning <i>is</i> Work. </h1><div>To some extent, I think our professional attitude and personal thirst for knowledge may do us a disservice, in that workplaces may take advantage of this, pushing more and more of this out of work hours and into time we should be resting, recharging and relating to other people. </div><div><br /></div><div>In the same way that "<a href="https://noidea.dog/glue">glue</a>" is work, learning is work, so work should support it! There's been a huge trend in modern workplaces to expect staff to arrive with, or self-gain, knowledge, skills and experience; I recall my dad being sent off on training for all sorts of things which I can't see much evidence of happening in the modern workplace. I can't recall ever being sent off for technical training - and the only "training" I've had has been around workplace rules or safety, or basic "how to use this product". So I think we need to make sure that it is understood that time we take outside of the work day to do this is a luxury or added bonus, not an expectation. We need to make sure this message is heard, loud and clear. 
</div><div><br /></div><div>Balanced against that - there are limits to what you can expect, particularly if what you want to learn isn't quite in that company's interest. Mostly, unfortunately, they want you to stay put where you are. If you want to progress, you may have to support your own development. But if they expect you to gain additional skills, or deploy new technology, part of the overall project should be educating the workforce involved in those changes. It always used to be in other industries!</div><div><br /></div><div>So... get learning recognised as work!</div><div><ul style="text-align: left;"><li>Discuss this during performance management processes. </li><li>Reframe the conversation: <i>Learning </i><b>IS</b><i> Work</i>. </li><li>Learning and practice <a href="https://www.forbes.com/sites/adigaskell/2017/07/20/do-we-need-a-20-time-for-learning-at-work/">lead to innovation</a>.</li><li>Request a library of technical books.</li><li>Put it in the "suggestions" box. </li><li>Bring this up in interviews. </li><li>Write about it. </li><li>Talk about it. </li><li>Demand it. </li></ul></div><div>It is all too common to have a "professional development" section in performance management which is entirely ignored - this is a short-sighted practice, and we should demand progress and protected time for this activity as stringently as we do for any other KPI. This should arguably be to the point that you could excel in every other area of the review and its targets, and yet still not even achieve "satisfactory" on the overall review if you didn't meet the learning goals. People achieve what they are rewarded to achieve; they do what is demonstrated to be the expectation - they do what you measure! </div><div><br /></div><div>To some extent, learning-as-work will help with workplace diversity and inclusion. 
If work develops people and allows a healthy work-life balance, it makes careers accessible to people from lower income groups; it allows more scope for families. This simple action - treating learning and staff development as important <i>work</i> and learning outcomes as <i>work product</i> - should help foster diversity and gender parity in the workplace - and keep your tech workforce up-to-date and current with the interminable gale force winds of change. </div><div><br /></div><h1 style="text-align: left;">How Long? </h1><div>Many people want some measure of how long learning anything is going to take. The answer is "somewhere between 20 and 10,000 hours". After <a href="https://www.youtube.com/watch?v=5MgBikgcWnY&feature=youtu.be&t=520">Josh Kaufman's 20 hours, spent the right ways</a>, you will be, I suspect, right at the <a href="https://schoolsysadmin.blogspot.com/2020/07/dunning-kruger-and-learning.html">summit of mount stupid</a> - but, and here's the important bit - you have enough knowledge/skill to really get into practicing and experimenting, and have a foundation to launch further investigation from. Twenty hours isn't a lot of time (certainly not when considering ten thousand!), and you're by no means going to be an expert in anything (that's more 10,000 hours territory) - BUT you'll have enough to really get stuck into a topic or lab things, and start applying that in the real world. </div><div><br /></div><div>One key point that is made is that for practical skills, in particular (programming, CLI commands, building things, troubleshooting, etc.), <i>reading </i>about them isn't as helpful to learning them as actively <i>doing</i> them is, at least once you've grasped the "fundamental" basics. 
Much of IT is skills that have some sort of a knowledge component to them - you need to know what a command might be, but you need the skill to use those commands in the right places and at the right times, and to see and experience how they behave and effect the theoretical knowledge in the real world. </div><div><br /></div><div>The <a href="https://youtu.be/5MgBikgcWnY?t=573">4 key points in that video</a> might also form a useful scaffolding for your own initial learning and progress, but as always, do what works for you - experiment a little with your learning and find what really works for you. This work is often done for you, particularly step 1; the exam topics/syllabus/curriculum are basically the broken out key skills and knowledge you need to get. I'll reiterate that most of IT is actually practical and skill-based, and less academic and cerebral - look at the way most people code - they literally hack it. They throw ideas at a problem and see what sticks. When it's good enough, they move on. Much as learning the 4 chords from the "axis of awesome" lets you blag your way to seeming vaguely reasonable at music, there are many similar tools that will equip you to go quite far in IT; basic building blocks you cobble together in various <i>ad hoc</i> recipes to solve the challenges you meet. </div><div><br /></div><div>Learning more arcane theory (Big O notation, or CAP theorem, more obscure algorithms and functions, for example) can be useful further down the line (and it is useful to know that stuff exists and you don't yet know it). At the end of the day, though, you probably want to go beyond "basic apparent competence" and move some distance towards the "mastery" end of the scale - and that <b>is</b> definitely going to take more time. It is often worth curating a list of things you want to learn more about later on, where chasing them now would only be procrastination. 
Put it at the back of your hardback notebook, or in a "stuff to learn" text file. Start with the basics!</div><div><br /></div><div>Commit 20 hours to learning something (not an entire professional field, but some important or intriguing part of it), with most of that time spent playing with the actual tools, and see how far you get - likely, far enough that your pleasure in getting somewhere is a useful ego-boost that will spur you on to do more - and you'll have a usable building-block you can reuse and recycle. And then wend your way down the 10,000+ hour path of expertise/mastery. </div><div><br /></div><div>Kaufman's <a href="https://youtu.be/5MgBikgcWnY?t=573">4 key 20 hour skill-building steps</a>, after you've decided what you want to learn, are: </div><div><ol style="text-align: left;"><li>Deconstruct the skill</li><ol><li>Break apart a career skill-set into components. Each of those is a skill. Each protocol is (at least) a skill. </li><li>Yeah, that might be quite a lot of 20 hour chunks of time for a career. You can eat an elephant - one meal at a time!</li></ol><li>Learn enough to self-correct</li><ol><li>You can go overboard with acquiring resources to learn from, and this can be a form of procrastination.<br /></li><li>Most resources will have practice examples. DO THEM! Do <i>not</i> just read about them or watch the video. </li><li>You need enough understanding to get going - and therefore notice things that are mistakes or errors. This is particularly easy with something that is a straight physical skill (playing a chord incorrectly gives you instant feedback; in tech, run the command sooner and check you get what you expect - and nothing else). </li></ol><li>Remove practice barriers</li><ol><li>Set up a lab!</li><li>Use the lab!</li></ol><li>Set aside time (Practice at least 20 hours)</li><ol><li>Calendar set times and guard them jealously. </li><li>Swap out (within reason) "empty" activities for enriching yourself. 
</li><ol><li>Doomscrolling is fun for a while, and you get the occasional nugget, but you'd have learnt more by doing for that hour. </li></ol><li>More concentrated learning periods tend to work better - scattered bits of learning work less well, because if they are short and infrequent, you spend a lot of each session "getting back to where you stopped". 20 hours in 5 minute chunks is rather different in effect to 20 hours in 2 hour (or perhaps more) chunks - remembering to take the odd break!</li></ol></ol></div><div>Bear in mind Kaufman isn't looking for "expert" mastery, or to develop a career - just competence enough for personal pleasure. But I think it's important that you see you have some hope of learning enough to start doing or understanding even very complex things in very attainable stretches of time.</div><div><br /></div><div>You'll probably recall, if you've read at all about DevOps or "Lean" manufacturing, the fundamental importance of rapid feedback loops to optimised and efficient processes. The same is true of education; ongoing quick/immediate feedback is <b>great</b> for learning. Delayed feedback is not (that essay you get comments on 2 months after submission? Useless!). Contrast formative and summative assessment. You rarely learn anything from summative assessments (other than where you lack knowledge - think exam); formative assessment can actually help you learn (end of chapter quizzes; more advanced learning platforms that give you immediate feedback of various sorts; pop quizzes; rapidly returned work; commands that immediately give you verbose errors; tail-ing logs). </div><div><br /></div><div>IT is quite good at instant feedback, in many instances: things work or they don't - use this to the benefit of your learning. "Book learning" has delayed feedback loops (it only gets battle tested some time later). Move to knowledge implementation, rather than knowledge acquisition, as soon as you can. 
We are actually quite bad at doing summative assessment in near-realtime in formal education - there is an area some startup will eventually make a shedload of money (or by giving it away, perhaps change the world) by using machine learning or similar to do things like adaptive difficulty and deliver near instantaneous feedback to ongoing assessments - a personal 1:1 trainer is ALWAYS more effective than a lecturer teaching 1,000 students. The tighter the feedback loop, the more powerful it is (I started scrawling notes on this whilst reading DevOps books, because the parallels to education were obvious to me). That's also the point at which e-learning platforms will really take off and live up to the hype. </div><div><br /></div><div>I have consistently found "just in time" learning to be the most effective for individual skills. I have a problem, I need to figure it out, and I've never used this technology before. Read some things, test some ideas, work it out. That knowledge is really "sticky". </div><div><br /></div><div>But - a word of warning - too much Just In Time learning may lead you to a situation where you have a lot of "gaps" - if you've gone quite far without more formal curricula (i.e. only by "crisis" or "immediate project need" learning), you may benefit from taking a step back, perhaps even going "back to basics" and reviewing the possibly much larger systemic knowledge pool you'd gain from a more systematic learning programme, and filling up your patchy knowledge to a more level whole. But take the obvious parallel - the sooner you apply "book learning" to actual problems and gear (virtual or otherwise) the more quickly you will learn, the more embedded it will be in your mind, and, likely, the more flexible you'll be in applying it - and it will often help you to <i>understand</i> the technology and underlying reasoning, too. 
There is a big difference between being able to recite correct answers, and actually understanding what they mean, and why they are the correct answer.</div><div><br /></div><div><div>Continued practice, whether at your day job or in a lab of hypothetical scenarios, refreshes that learning continuously. Remember how you "got" trigonometry at school decades ago, and there's something about SOHCAHTOA? Can you actually <i>still</i> do trig, or have you forgotten? If you're a natural Maths Genius or regularly still do trig, ignore this, but realise ordinary mortals forget this stuff very quickly without regular repetition and practice. Much knowledge and skill is like this - regular practice and reinforcement is vital. </div><div><br /></div><div>I strongly suspect that changing from high stakes summative assessment at the end of several years of learning to regular summative assessment in short "modules" means people actually know less after a period of time. If you only have to learn a term's worth of information, you can kind of cram it into short-term memory and do quite well in an exam on it. Several years' worth of information does not fit in there, so you have to have the larger knowledge area truly mastered. Importantly, short term memory decays fast - so there is quite a lot of what I knew that I knew I no longer really know, either because I crammed it, or because I've never needed or used it again, and brain processes have discarded it - or it was never even a candidate for long term storage in the first place. However, it definitely has "echoes" that allow me to reacquire working knowledge or rapidly find the method with a lot less effort. I can't remember the full details of pKa, but know it exists and the rough "shape" of the idea it expresses and its applications, and can therefore quickly find what I need to work out how to make a physiological buffer of a particular pH (and, no, I haven't needed to make a buffer in like 20 years). 
Of course, it could be argued that all I need to be able to do is google "pH 7.3 buffer solution" and the recipe is <a href="https://www.aatbio.com/resources/buffer-preparations-and-recipes/phosphate-buffer-ph-5-8-to-7-4">probably out there</a>. Similarly, I often have context-responsive knowledge - I need the visual reminder of the menu navigation options for a program function or the CLI prompt before I can walk someone else through a process (this is a sign of intermediate or infrequently refreshed knowledge, I suspect - you know it, but not enough to do it "in your sleep"). "<a href="https://en.wikipedia.org/wiki/I_know_it_when_I_see_it">I know it when I see it</a>" applies in a parallel sense as much to some forms of knowledge as its more infamous connotation. </div></div><div><br /></div><h1 style="text-align: left;">Gut Feel</h1><div>Finally, a word on <a href="https://en.wikipedia.org/wiki/Tacit_knowledge">tacit knowledge</a>, which you may have seen experts use all the time. This is the sort of "spooky" knowledge or "gut feel" that mastery seems to produce; you <i>know</i> what the problem or solution is, but you may have a really hard time verbalising <b>why</b> that is the case. The only way you acquire this is practice and deep familiarity with things. You can see it "looks wrong", or it "feels like category problem X", or "the answer is probably Y". Once you find you get these intuitions regularly - and they are mostly correct - well, congratulations, you're probably way down the road to expertise. This is the result of your brain's pattern-matching circuitry working on possibly thousands or millions of examples of things that were "right" and things that were "wrong", and an internally constructed model or simulation of those systems against which observations are run, and providing an answer for you - but you'd struggle to reduce that to a learning programme that was anything other than "spend 20 years doing this stuff, and you'll also get it". 
</div><div><br /></div><div>So go start getting that raw data stuck in your brain and learn by doing and in practice, and build those mental models! Some people, who "just get" certain domains of knowledge also find much of that entire field is effectively "tacit knowledge" - <i>it just </i><b>is</b> <i>that way </i>- and they will struggle to teach or mentor anyone who is not also like that. I think this is why I've often found maths teachers to be particularly bad (yeah, sorry, I am bringing this up again) - I don't naturally "just get" mathematical concepts; I have to <i>work</i> at them. Whereas some mathematicians, for whom it seems to me that it "just clicks", end up as teachers - and seem to struggle to verbalise - and therefore, to <i>teach</i> - that same content to others. The whole of maths, to them, is perhaps tacit knowledge. Many (but not all) people who display amazing mastery of things are (often) not great teachers or mentors, but they can certainly be inspirational - or, more likely - aspirational goals! Whether you are looking for someone to help you along, or are considering teaching, coaching or mentoring someone else, realise that teaching, coaching and mentoring themselves are skills that need practice, and be aware that those who "just get" a topic may find it harder to relate to those who don't. It is surprising how little effort we are expected to expend in learning how to master leading, managing, mentoring or teaching people in the workplace.</div><div><br /></div><div>If you do end up around people that know amazing things but can't really explain why or how, figure out how to leverage that anyway - just watching them work can help you understand things. Figure out how to do that unobtrusively and without them feeling like a bug under a microscope! Sometimes, you can actually coax answers out of them by asking the right questions (at the right time - in the middle of a P1 outage is almost certainly not the right time). 
You might also just need to write down a whole bunch of new words and concepts and google them later! Their war stories will probably help your brain add to the development of your own spooky pattern recognition circuitry, too. </div><div><br /></div><h1 style="text-align: left;">In Summary</h1><div>So this is, for a blog post, quite long and information-dense. Apparently, it helps to summarise the key points at the end, so...</div><div><ul style="text-align: left;"><li>Your brain is amazing at remembering things, but it protects itself by forgetting things, too.</li><li>You don't get much choice in this.</li><li>You can influence it primarily in two ways to make your learning more effective: </li><ul><li>Contextualisation and linking new knowledge/ideas to other existing memories, emotions and understanding.</li><li>Repetition is absolutely key.</li></ul><li>Use different expressions of knowledge:</li><ul><li>Figure out what modes of learning work best for you; try different sources (content creators) and styles (written, audio, video, etc.), because different presentation or different ways of seeing the same topic may help you build your own meaning.</li><li>Take handwritten notes.</li><li>Lab. <a href="https://twitter.com/search?q=%23LabEveryday&src=typed_query">#LabEveryday</a>. 
</li><li>Teaching is a form of practice, and because teaching often involves explaining things in several ways so people "get it", enhances your own mastery and detailed understanding of concepts - you also have to contextualise, clarify and distill the concepts in order to teach, which, again, is good for really knowing.</li><li>"Practical" learning is best - enact concepts you learn in physical and tangible ways.</li><ul><li>If you teach, use this!</li><li>essentially, this is <a href="https://en.wikipedia.org/wiki/Blended_learning">blended learning</a> - theoretical knowledge plus virtualised practice with immediate formative assessment and feedback.</li></ul></ul><li>Practice implementing knowledge as soon as you can after getting it; link theoretical concepts to practical implementation of them. If you have a fantastic or stimulating project idea, write it down in the moment and implement it as soon as you can. Do not start a new chapter, section or topic until you've done at least something with the last one that is "real". </li><ul><li>Shorten and enhance feedback loops in your learning. </li></ul><li>Use project-based learning - have a defined target or task in mind, figure out what you need to know to get there, get that knowledge, and practically apply that knowledge; <a href="https://inventtolearn.com/buy-the-book/">invent to learn</a>. Whilst gaining a cert or a job (or a job title) may be goals, they are not projects - build something as a project. </li><li>If you are in any position to influence culture in your organisation or team, try to get learning-as-work normalised and widely practiced. If nothing else, freeing up some space to just think and play often leads to breakthroughs, but it's also a powerful way of ensuring your organisation as a whole really becomes a "learning organisation". 
</li></ul><div><br /></div></div><h1 style="text-align: left;">Further reading</h1><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><img alt="blue wooden door surrounded by book covered wall" height="325" src="https://images.unsplash.com/photo-1484415063229-3d6335668531?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1000&q=80" style="margin-left: auto; margin-right: auto;" width="512" /></td></tr><tr><td class="tr-caption" style="text-align: center;"><span style="text-align: left;">Need Moar Books!<br />By @eugi1492 https://unsplash.com/photos/6ywyo2qtaZ8<br /><br /><div style="text-align: left;"><br /></div></span></td></tr></tbody></table>This post was mainly inspired by thoughts triggered by: <div><div><ul style="text-align: left;"><li><a href="https://www.edutopia.org/article/why-students-forget-and-what-you-can-do-about-it">https://www.edutopia.org/article/why-students-forget-and-what-you-can-do-about-it</a></li></ul><div><div><br />Humour might hurt learning: </div></div><div style="text-align: left;"><ul style="text-align: left;"><li><a href="https://www.brainscape.com/blog/2012/10/why-we-dont-remember-jokes/">https://www.brainscape.com/blog/2012/10/why-we-dont-remember-jokes/</a></li></ul></div></div></div>James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com0tag:blogger.com,1999:blog-5949954951402579361.post-39703000080916238452020-07-05T16:03:00.012+01:002020-07-06T15:45:17.219+01:00Dunning-Kruger and LearningI can't imagine you've never heard of the Dunning-Kruger effect - it is a popular and well known model of actual expertise vs. perceived expertise, and your confidence in them. I think, as IT professionals, it is useful that we know it exists, particularly as we embrace new technology or extend our learning. <div><br /></div><div>So why is this a big deal...? 
</div><span><a name='more'></a></span><div><br /></div><div><br /></div><h1 style="text-align: left;">The topology of Wisdom and Confidence </h1><div>First of all, let's recap the central ideas of the <a href="https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect">Dunning-Kruger effect</a>. Sadly, the original article from about 20 years ago is paywalled (<a href="https://doi.apa.org/doiLanding?doi=10.1037%2F0022-3514.77.6.1121">grr</a>), but there is plenty of information about it online. </div><div><br /></div><div>In essence, this effect tells us that, shortly after being total n00bs, we assume we confidently know a hell of a lot about <i>everything</i>. Many people reach that point, and never progress. This is the lofty peak of the so-called Mount Stupid. We do NOT want to build a comfy cabin atop Mount Stupid, and rest on our somewhat windswept laurels. </div><div><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvPVjQWJFkaVdlO88WQp1NPsby6FqFmseHF6exYpWMvdpV3HLSiXs40BurtNcbih0552n2ima646VGfgbA5qm8p8GCpIHNJIsCg73PeopLBYMZ9Mw0puW6ngfOlLcOYx6ZxDxC1lAT6B0/s500/dunning-kruger-effect-detailed.jpg" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="367" data-original-width="500" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvPVjQWJFkaVdlO88WQp1NPsby6FqFmseHF6exYpWMvdpV3HLSiXs40BurtNcbih0552n2ima646VGfgbA5qm8p8GCpIHNJIsCg73PeopLBYMZ9Mw0puW6ngfOlLcOYx6ZxDxC1lAT6B0/s320/dunning-kruger-effect-detailed.jpg" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The Dunning-Kruger effect<br />found at <a href="https://www.bcs.org/content-hub/the-uncomfortable-truth-about-agile/">https://www.bcs.org/content-hub/the-uncomfortable-truth-about-agile/</a></td></tr></tbody></table><div><br 
/></div><div>Unfortunately, after the comfortable lofty heights of Mount Stupid, we descend into the Valley of Despair, where we think we know nothing, will never get anywhere, and probably drown in an intellectual bog or get stuck in motivational quicksand. </div><div><br /></div><div>However, if we can push through that, we'll reach the Slope of Enlightenment, and, after a likely grueling climb, peek over the edge of the canyon across the Plateau of Sustainability - possibly into a glorious sunset and catching sight of a perfectly chilled beverage waiting for us. Our journey is not yet over, however!</div><div><br /></div><div>Take-home messages? First, beware of the times when you think you've mastered a topic. You might well be sitting atop Mount Stupid. Secondly, when you're mired in the Valley of Despair, eating the last of your Lembas Bread and covered in muck, know that if you keep pushing you will eventually climb. Thirdly, note that there is a plateau of sustainability - the learning cannot, and should not, stop; it keeps on going. You may also find that there are many plateaus of sustainability and valleys of despair in your own path.</div><div><br /></div><div>For additional insight, realise that this is a landscape, not a linear path - these treacherous peaks spread out in every direction (and dimension) - and you can be at different places in this curve for any skill or knowledge. You might, for example, achieve a top level certification in one narrow discipline, and likely be somewhere on the plateau of sustainability for that, but possibly that sense of expertise may leak across into other areas - where you're only atop Mount Stupid (for example, people who think management is easy after mastering technical skills may well be sitting firmly atop Mount Stupid as managers - because they have not learned enough about management to realise where they are). 
</div><div><br /></div><div>If you've spoken to people who are highly educated (for example many academics) who take stock of their knowledge, many of them will come to a conclusion that their learning mainly underscores how little they know - not that they have become omniscient cognoscenti. There are three major categories of knowledge - things you know (I know this stuff! "the known"), things you know you don't know (I have a knowledge gap here about this stuff! "the known unknown"), and things you don't know you don't know (I've never even heard of this stuff! "the unknown unknown"). For example, <a href="http://amandabauer.blogspot.com/2010/02/three-types-of-knowledge.html">take this blog post</a>, illustrating that what education seems to do is slightly increase the stuff you know, but really underscores what you know you don't know - but still hardly touches on things you don't know you don't know (how can it?)! </div><div><br /></div><div>The other side to Dunning-Kruger is <a href="https://en.wikipedia.org/wiki/Impostor_syndrome">impostor syndrome</a> (described more than twice as long ago as Dunning-Kruger, in 1978), where you know rather a lot, but either by misunderstanding what you know versus what you perceive others know (and denigrating your own knowledge), or by mentally highlighting all the known unknowns and undermining the value of the known against that. Impostor syndrome might be where you drag the valley of despair far further to the right by underestimating your perceived knowledge and confidence therein, not recognising your successful ascent of the Slope of Enlightenment. It's super common in people who pursue advanced degrees, people who change careers, and under-represented people in the workforce. It's quite hard to separate out the Valley of Despair from deep impostor syndrome, but you may be able to figure out where you are through using external viewpoints (What do your colleagues say about your knowledge or performance in the job? 
How highly qualified or experienced are you vs. others in your field?). It's probably best not to dwell on this dichotomy too much, but instead use it as an impetus to at least tackle the known unknowns - tracking along that wisdom axis in a positive direction, no matter what. </div><div><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjevlU7yATltMBnA0o1mJcUczQZx3zOqELN0Y3AnXuyV4hZtpeG_kzESwQFcmqd3bY5tFkiiMGOHEiaUMBUAerQewhYO4OwIT7kIuI-SgklbPcwYQIvaZLjF4u2x7hIPO5ybgIWh33w6X8/s600/dunningkruger-impostersyndrome.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="446" data-original-width="600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjevlU7yATltMBnA0o1mJcUczQZx3zOqELN0Y3AnXuyV4hZtpeG_kzESwQFcmqd3bY5tFkiiMGOHEiaUMBUAerQewhYO4OwIT7kIuI-SgklbPcwYQIvaZLjF4u2x7hIPO5ybgIWh33w6X8/s320/dunningkruger-impostersyndrome.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Dunning-Kruger vs Impostor Syndrome<br />Found at: <a href="https://ardalis.com/the-more-you-know-the-more-you-realize-you-dont-know/">https://ardalis.com/the-more-you-know-the-more-you-realize-you-dont-know/</a></td></tr></tbody></table><div><br /></div><div>In my own journey, inevitably the moment I think I've got somewhere, I take a look down the path and see a whole bunch of new things I've not yet learned, and slip into a more or less profound Valley of Despair for a while. That is both exciting (as someone who actively likes learning) and daunting (as someone who wants to "master" something). I know there are large areas of human knowledge in which I have little interest in developing expertise. There are others I know to defer to experts. 
I recognise that there are unknown unknowns, and they may blindside me on some idle Tuesday - but for the most part, I find that more knowledge always pays off, and that more learning points out and allows me to selectively fill gaps. </div><div><br /></div><div><br /></div><h1 style="text-align: left;">What does this mean for a career in IT, anyway? </h1><div>Just like the teenager (the classic exemplar of the "knows everything whilst knowing nothing" trope), or the recent (under)graduate, professionals risk falling into the same traps. Think for a moment about your own journeys in technology, or life more broadly. Can you remember a time when you thought you'd basically mastered everything to do with an area of IT - yet if you think about it now, you laugh to yourself, and consider how very deluded Younger You was? There you can see a perfect personal example of sitting astride the summit of Mount Stupid. You may even have some horror stories of the messes you got into from those lofty heights of neophyte over-confidence.</div><div> </div><div>Let's go a step further. Reflect for a moment on how well you think you know a topic. Pick anything, I'll wait... </div><div><br /></div><div>Right, so where do you think you are on that curve with that topic? Be honest with yourself. Are there actually large areas you've ignored intentionally, or even don't know exist at all? How can you find out? Well, here is an area where qualifications help. If you have a particular qualification (or they exist on that topic), go look at the course content for the next highest level. How well do you think you know that stuff? Is it all new and unknown? Something you only have a vague understanding of? Oh, great, you've already reached the highest level of that programme! Right, go look at a parallel stream, or a similar qualification for another vendor - or a platform or technology you've never played with. Is that certification only an entry level programme? 
See if there is a higher level one. There are some areas that only a few people on the planet understand, once things get complicated enough. There are some things nobody knows!</div><div><br /></div><div>If you've not already considered yourself stuck in the Valley of Despair, it's possible that we've just plotted your location on GPS, bang smack in the middle, treacherous ground to all sides, and no way to pick our way back to the old glories at the pinnacle of Mount Stupid. It's disheartening, perhaps, but I think we all ultimately recognise that the summit of Mount Stupid is a bad place to be. </div><div><br /></div><div>If you're serious about a long term - successful - career, you need to figure out how to keep developing. IT is a particularly vicious field in this regard, because the industry likes to put the goalposts on an ever receding conveyor belt. If you consider the Red Queen of the Alice books, you're running just as fast as you can just to stay in the same place. Stop, and you'll be dumped off the end. Plan it out - set targets, realistic timelines and goals, and how you can get there in the form of a <a href="https://capd.mit.edu/explore-careers/career-first-steps/make-career-plan">career plan</a>. Make sure you discuss this with partners, as achieving it is likely to significantly eat into time at home, particularly where there is little or no active development at your day job. They may also have to put up with a lab environment in the home, which can be a contentious point! Recognise there is a need to balance career/work and life, and nobody else is going to help you get there - and always remember that "career" can risk undermining some elements of "life".</div><div><br /></div><div>Any good organisation should recognise this trend, and make sure they develop their staff. Of course, your long term career aspirations and their desires for you in a particular position may not meet 100%. 
You will need to find a balance between work provided professional development activities, time you spend during working hours extending your knowledge, "leisure" time you devote to study, and all the other things you need to do for "life" in order to achieve that most tenuous and nebulous of goals - work/life balance. If the organisation you're in doesn't provide learning time for you, or actively support learning activities through providing books, equipment to "lab" on, mandatory training or financially supporting elective training you desire (and giving you paid study leave to do it), then this might be a significant downside to working for that employer - this may be bad enough to mean you need to look elsewhere, or something to emphasise in your search for jobs, or during interviews or benefit negotiations. </div><div><br /></div><div>Realise that as much as you may improve and get better and better at your job, and start to gain knowledge (and even experience) beyond your job description, there is a common trend in many companies that there isn't much internal promotion; either staff turnover is low so they can't promote, or they simply don't provide internal promotions when positions are open, or "grade bumps" or other incentives to retain staff. Modern HR tends to also focus on positions, not people (arguably the very opposite of the first core principle of Agile!); unless the department can prove they need a person in role X, they can't create it just to keep you happy. I've been in meetings where HR people say this out loud as inflexible and absolute policy. Of course, if a department wants to entirely restructure around this, there may be some wiggle-room - but it still means they need to display a need for the role - regardless of who fills it. 
It also partly explains why job descriptions and the "in the trenches" experiences are often a little different; HR picks and enforces "standard" job templates, but they don't quite meet reality in a company. This means you inevitably have to "hop" laterally - and slightly upwards - to get to your desired positions, particularly if you have aggressive goals. You will often have to decide whether staying where you are is a good thing or not - and realise that comfort in a particular position may be analogous to the intellectual Mount Stupid! Your career plan should keep you on track and regularly assessing your own progress toward those goals. </div><div><br /></div><div>You may also be quite happy to stay in more junior roles, or even not be able to get above a certain level for various reasons - and that is 100% fine, but make sure you give it significant thought and discuss it with your partner; consider potential "career plateaus" that you may not be able to rise above without significant changes. There is a lot of horse-trading in achieving a work-life-progress balance that works for you - and your family. And hey, your ideas or vision may change, and that's also perfectly fine!</div><div><br /></div><div>Recognise that a career plan in IT itself can't be static - new jobs, roles and even industry sectors appear as if by magic. You'll also change over time. Make sure you spend at least a little time regularly examining major trends and how those are likely to (positively and negatively) affect your career and the organisation(s) you work for - and use those insights to adjust that plan, and the insights, advice or direction you give to your organisation or people you may end up mentoring. 
</div><div><br /></div><div>Any time you really think you've mastered a topic, go and hang out with people who know more than you do - either by attending a conference, or joining a high level technical mailing list - either in something you think you're pretty good at or something you're learning. You will rapidly be shown just how little you know about a field, or the day to day intricacies of theory in practice! If you really and truly have reached the end of the route of all knowledge in one track (that's some chutzpah there...), you can either wait 3-5 years and have everything change, or you can start applying your learning skills to a new road of intellectual discovery - or both... </div><div><br /></div><div>"I don't know, but I know <i>how to find out</i>" might be the ultimate goal in life as a person in a technical field - particularly at the bleeding edge and with a new challenge. Of course, if it's something basic, well, you might feel a little embarrassed at needing to trot that sentence out, but at least you can ultimately get it done!</div><div><br /></div><div><br /></div><h1 style="text-align: left;">What does that mean in Life? </h1><div>Ah, the big questions. You've surveyed a microcosm of the rich universe of human experience and knowledge. Recognise that no matter how hard you try, you will never truly reach the end of the Plateau of Sustainability, and that there are probably a lot of Mount Stupids waiting for you. </div><div><br /></div><div>What to do? </div><div><br /></div><div>Work on harnessing other people's knowledge - do this by empowering yourself through outstanding information literacy, a commitment to continuous personal (and professional) development, cultivating a rich network of experts (including, and perhaps particularly, those whose views you find alien or difficult), and learn humility and a habit of listening to others. 
It should be immediately obvious that diversity in all senses is important here.</div><div><br /></div><div>Recognise you'll get stuck in knowledge gaps - there, you need to realise that you either need to train yourself, reach out to a subject matter expert (and likely pay them), or stop, reverse and get out of there. The third option (give up) is rarely the right choice. Picking between the other two is really a risk/reward exercise (and recognise that your judgments as a neophyte in that area inevitably carry more risk) - is it personally worth a likely multi-year journey to master that topic to any reasonable degree? In many cases, it might not be. In others, it might be, but you need a well considered answer <b>now</b>; in others it may be, <i>and </i>you can afford the time to work at the knowledge. It is particularly valuable to be able to recognise when a "surface" understanding of a topic is good enough, and where it is likely to be a problem - of course, you're most easily able to assess that once you are well on the path to mastery...! </div><div><br /></div><div>If you've not paused to consider ways to measure up the gulf between Mount Stupid and the Plateau of Sustainability, there's always a big clue in how long it takes to become a qualified expert in that field. Medicine requires many, many years of study and practice before someone becomes a junior doctor. Look to people you admire in your field, and consider how long it took them to get there. There are many further years of expertise, experience and training waiting as they become more specialised. You'll also note that cardio-thoracic surgeons just aren't going to quickly attempt neurosurgery, and there is a healthy system of referrals within the medical industry, recognising that nobody knows everything (not even Gregory House). What you can google about a medical topic is not likely to bring you up to a level playing field with an actual professional. 
Consider what might be the qualification of the expert you need for problem X? Do you need a lawyer or an actuary? An accountant? A statistician? What are the parallels in your profession? There are similar distinctions in IT - people who specialise in particular parts of the infrastructure (storage, network, compute), or the practice of using them (dev, ops) - and the myriad of different "flavours" of each of those things. IT is unusually "porous" in allowing you to quite easily jump into different tracks here; it is much harder for a "real" Engineer to jump between Civil to Mechanical or Electrical engineering, for instance, than it is to jump from "Network Engineer/Architect" to "Software Engineer/Architect", or something else (or <i>vice versa</i>). Likewise, doctors and surgeons don't simply change track within a year or two. I agree with those who think using the term "Engineer" or "Architect" as a job title without a professional standards body endorsing that is unwise and perhaps even disingenuous; however, in many countries this is both allowed, and commonly practiced. IT is still surprisingly young as an industry, and relatively unregulated (surprised?). Perhaps we will eventually have more rigorous standards for these nebulous titles, but until then, we must make sure that we personally try to live up to the high professional standards that "real" Engineers and Architects meet and display. Certainly, the importance of IT in modern society suggests we ought to move in that direction with some enthusiasm. That may make it harder to get into IT (because there will be more formal - and doubtless expensive - education requirements), but it will probably be good for the industry in terms of professionalism and robustness.</div><div><br /></div><div>A particular point of humility that bears consideration is that of "business context". We are prone to two things in IT - wanting shiny baubles, and resistance to change (because Risk!). 
You need to understand that IT is there to provide a service to an organisation; IT is not in and of itself (usually) the goal of the company - but is increasingly important in how the goal is delivered. The goal is typically to sell products or services to someone - or the all too often ultimately destructive "maximise shareholder value". You don't necessarily need the latest or greatest toy, framework or technique. IT's role is to make an organisation's success as easy and effective as possible - with the best ROI, if not at the lowest cost, possible - and ensuring that executives are given the information they need to make sound decisions in business-centric IT contexts. This means that yes, you need to understand things about your organisation that are not purely IT - you need to listen to what the C-Suite are saying about the overall organisation, what marketing are saying about their target audience, and what your internal customers (colleagues) are saying about the services IT provides. This also takes time, but being able to understand these contexts and then deliver realistic solutions makes you a much more "useful" employee. </div><div><br /></div><div>Of course, there is only so much time in the day, and this is why you will see that people often seem to lose some of their technical chops when they move into more "leadership" positions - they have to onboard that context, develop management skills, and so on - which inevitably, along with lack of all day hands-on IT experience, may mean they're less good than they once were at &lt;some IT skill&gt; - but they're bringing something else to the table as a result. When you're developing your career plan, take a long, hard look at whether you want to stay in a "hands on" role, or take on "leadership" or management roles. Again, either path is equally valid, but it needs to be right <i>for you</i>. 
In most organisations, if income or power/responsibilities are key drivers for you, a move into management or leadership is inevitable - but you may miss that intricate tech stuff. Also, be aware of "<a href="https://noidea.dog/glue">glue work</a>" and make sure it is handled equitably! Sometimes, you can't answer that question (is leadership for me?) without experiencing it - this may require two lateral jumps - one into a leadership position, and perhaps one back out of it if it is not your "vibe", but you should be able to spare yourself that anguish by more carefully reflecting on what you do and don't like, and carefully investigating what life is actually like in such positions - perhaps even by thinking laterally and taking a leadership position outside of work in a charity, club or some other voluntary capacity.</div><div><br /></div><div>The intellectual discomfort of knowing Mount Stupid is there waiting for you should be harnessed in making sure you push yourself right off that comfy summit and undertake profound life-long learning in all areas that make sense in your personal and professional life. </div><div><br /></div><h1 style="text-align: left;">tl; dr:</h1><div><ul style="text-align: left;"><li>You probably don't know as much as you think you do. </li><li>You certainly don't know what you don't know you don't know. </li><li>You definitely need to do something about that. </li></ul></div><div>#LearnMoar. </div><div><br /></div>James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com0tag:blogger.com,1999:blog-5949954951402579361.post-70025254822268231012020-03-25T17:13:00.005+00:002020-07-05T21:26:16.272+01:00Tune your home network for Work-from-Home / Quarantine / Isolation - "Un-Suck your Wi-Fi"!I can't authoritatively comment on COVID-19, despite a lot of years in the biological sciences, even some microbiology and a long term, vague (perhaps somewhat morbid) interest in emerging pathogens.<br />
<br />
I certainly CAN suggest how you can tune up your home network to ensure it's going to be the best it can be to cope with you, your partner, flatmates, relatives and children (or whatever your domestic arrangement might be!) stuck at home and depending on the Internet as a life-line.<br />
<br />
More and more countries seem to be enacting (entirely justifiable) stringent isolation measures for many weeks at a time, so I trust this information will be of some small help during this difficult time.<br />
<br />
<a name='more'></a>
So, on to "un-sucking" your wireless home networking!<br />
<br />
<br />
Unlike my other posts, this is really aimed at home networks and non-IT professionals, and I'm going to try and take a fairly non-technical approach (as much as that might be possible for what is, at its heart, a tech issue!), with an aim to help non-technical users who have a little determination to learn something new and improve things! There is quite a bit to cover, but do plough through it.<br />
<br />
We'll run through a few possible goals, ranging from essentially free, to possibly quite expensive.<br />
<br />
<h2>
Start Where You Are (free)</h2>
<div>
It is incredibly likely that if you're reading this, you have a home internet connection of some kind, which has built-in wireless capabilities. It is also likely that with everyone at home and trying to use the Internet, there are going to be some frayed tempers when there is spotty internet coverage in your home. Certainly, when our cable internet failed on Saturday, there was much wailing and gnashing of teeth here. Fortunately, it came back...</div>
<div>
<br /></div>
<div>
So what can you do for little to no money to improve matters? </div>
<div>
<br /></div>
<h3>
Argh, this is all too complicated. Is there a single, simple thing I can do?</h3>
<div>
Here are two - if you do nothing else, these may alleviate some of the pain for very little trouble. </div>
<div>
<br /></div>
<div>
<ol>
<li>Power-cycle your "router". Unplug it, leave it unplugged for about 2 minutes or so and plug it back in again. It will usually take several minutes (up to 20 on some platforms!) for the Internet to start working again, but this can fix a lot of common glitches, and will often cause the router to pick a different (often better) channel without you knowing why your Internet now sucks less. It can also help sort out problems related to memory leaks in your router firmware, or poor sync settings between the router and ISP. Magic!</li>
<li>The other totally non-technical approach is simply being in the same room as the "router", which helps a lot - move yourself closer to the router / access point. In a single person home, this is really simple. Sit next to the Wi-Fi equipped router, or even plug in an ethernet cable between the router and your laptop/PC!</li>
</ol>
</div>
<div>
Now on to some more in-depth learning and adjustments...</div>
<div>
<br /></div>
<h3>
Pick the best channel</h3>
<div>
First of all, particularly if you are in an urban area, notably in a block of flats / apartments, it is well worth your while checking that your home router isn't being stupid in its choice of wireless channels. To do this, you'll need some (free) tools to "see" something invisible - the radio waves that make Wi-Fi work. </div>
<div>
<br /></div>
<div>
If you don't already have one, download some kind of wifi scanner tool for your smartphone or laptop. On Android, I quite like <a href="https://play.google.com/store/apps/details?id=com.ubnt.usurvey">WiFiman</a> or the classic <a href="https://play.google.com/store/apps/details?id=com.farproc.wifi.analyzer">Wifi Analyser</a>. On Windows, <a href="https://www.metageek.com/products/inssider/">Metageek's InSSIDer</a> does a decent job (you'll need to register an account to get it to work these days). On the Apple iPhone/iPad side of the coin, I can't recall any good, free applications. In Mac land, there is the great <a href="https://www.adriangranados.com/">WiFi Explorer</a>, for which a limited free version is available. </div>
<div>
<br /></div>
<div>
To use such scanners, you need to understand a little theory. Firstly, just like an FM radio, there are <a href="https://en.wikipedia.org/wiki/List_of_WLAN_channels">channels</a> you can scan through; each has a specific frequency (measured in Gigahertz, GHz, although usually referred to by a channel number like 1, 6 or 11, or 40 and so on). You also need to know that the channels are a certain width (in Megahertz, MHz) - wider channels use up more of the available spectrum. If two or more devices share the same channel, they will interfere with each other. If two adjacent channels overlap with each other, then they will also interfere with each other! This means if your neighbour's wireless is using the same frequency as yours, then the two networks will interfere (and you should also logically deduce that the more devices use your Wi-Fi, even in an isolated area, the worse it will be - similar to how it gets very hard to hear other people talk in a room full of other people talking). </div>
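<div>
If you are curious how channel numbers relate to actual frequencies, here is a minimal Python sketch. The function names (<code>channel_to_mhz</code>, <code>channel_span_mhz</code>) are made up for illustration, and the sketch treats a channel as a simple block of spectrum around its centre frequency, which is a simplification of how real radios behave.</div>

```python
def channel_to_mhz(channel: int, band: str = "2.4") -> int:
    """Return the centre frequency in MHz for a Wi-Fi channel number."""
    if band == "2.4":
        if channel == 14:
            return 2484  # channel 14 does not follow the 5 MHz spacing
        if 1 <= channel <= 13:
            return 2407 + 5 * channel  # channels sit 5 MHz apart
        raise ValueError("2.4 GHz channels run from 1 to 14")
    if band == "5":
        if channel <= 0:
            raise ValueError("channel numbers are positive")
        return 5000 + 5 * channel
    raise ValueError("band must be '2.4' or '5'")


def channel_span_mhz(channel: int, width_mhz: int = 20, band: str = "2.4"):
    """Return the (lowest, highest) frequency a channel nominally occupies."""
    centre = channel_to_mhz(channel, band)
    return (centre - width_mhz / 2, centre + width_mhz / 2)


if __name__ == "__main__":
    for ch in (1, 6, 11):
        print(ch, channel_to_mhz(ch), channel_span_mhz(ch))
    print(60, channel_to_mhz(60, band="5"))
```

<div>
The point to take away is that neighbouring channel numbers in 2.4 GHz are only 5 MHz apart, while each network is at least 20 MHz wide, so nearby channel numbers share spectrum.</div>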
<div>
<br /></div>
<div>
There are two major frequency bands; the so-called 2.4 and 5 GHz bands. 2.4 is the "older" and more commonly used band - pretty much all wireless gear supports it. 5 GHz is more modern, and supports potentially higher speeds. Particularly in urban areas, there is often more "room to breathe" in the 5 GHz frequency band, unless your neighbour's router thinks 160 MHz channel widths are a good idea (protip: they are most certainly not; even if your router supports them, avoid!). This is both because fewer people use the 5 GHz frequency range (although that advantage is vanishing quickly) and because there is way more spectrum available - there are only 3 non-overlapping 20 MHz wide channels in 2.4 GHz. In 5 GHz, there are at least 20, depending on the regulations where you live. </div>
<div>
<br /></div>
<div>
So what do you need to do with these newly installed apps and this newly acquired knowledge? </div>
<div>
<br /></div>
<div>
Well, step one is to know what your home network's name (SSID) is. That's the name of the network you connect all your devices to. Armed with that information, you want to then open up your shiny new app and scan the airwaves around you. This can take a minute or two to complete. Depending on the app you use, you will eventually be greeted with a screen that shows the surrounding networks (you may have to click/swipe through the options to find the most helpful screen). The apps are pretty intuitive, I think - they show the strength, channel and channel width of all the networks they can "see" from where they are - results will likely change as you move around a room and certainly a home.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgW9zpnoCQwJ5HMPnd1G7vsI6HI9TdgkfFHevuhaG19KsQD1VGUO96q12j4VvprGUI3gL9SDWZ0uGho9HK8a5LeQFmViloDMO94uNOwfLn9Khk8k7cE6Hf41XkRPEk4WVQvpl0YmZc4t_o/s1600/Screenshot_20200323-210352_WiFiman.jpg" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1357" data-original-width="1409" height="308" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgW9zpnoCQwJ5HMPnd1G7vsI6HI9TdgkfFHevuhaG19KsQD1VGUO96q12j4VvprGUI3gL9SDWZ0uGho9HK8a5LeQFmViloDMO94uNOwfLn9Khk8k7cE6Hf41XkRPEk4WVQvpl0YmZc4t_o/s320/Screenshot_20200323-210352_WiFiman.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A screenshot from Ubiquiti's WiFiman showing the local 2.4GHz environment.<br />
Note the peaks on channel 6. A number of networks are overlapping (not ideal, but inevitable in a block of flats).<br />
It's hard to see, because they are similarly strong, but there is a VM7705786 and MikroTik network sitting on top of each other, which is not great.<br />
Note the "inconsiderate" neighbour TalkTalk3456EC using a 40MHz channel width, which overlaps into channel 6.<br />
Moving our WiFi network to channel 1 should be the best bet, given this information. </td></tr>
</tbody></table>
<div>
<br /></div>
<div>
In the screenshot above, you can see that we have some overlapping networks and even an inconsiderate neighbouring router using a 40 MHz channel width. 40 MHz channel widths should NOT be used in 2.4 GHz networks. Only channels 1, 6 and 11 should typically be used (because all the others overlap with these, which is bad) - and only use them with 20 MHz channel widths. We can probably make our wireless better simply by moving to a less busy / loud channel - in this case, channel 1. </div>
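<div>
To see why a 40 MHz wide neighbour is such a nuisance in 2.4 GHz, here is a rough Python sketch of an overlap check. It assumes idealised channels (a flat block of spectrum centred on the channel number), and the <code>overlaps</code> helper is a name invented for this example. Real transmissions spill a little beyond their nominal width, which is one reason the 1/6/11 spacing leaves some margin.</div>

```python
def centre_mhz(channel: int) -> int:
    """Centre frequency (MHz) of a 2.4 GHz channel numbered 1 to 13."""
    if not 1 <= channel <= 13:
        raise ValueError("expected a 2.4 GHz channel between 1 and 13")
    return 2407 + 5 * channel


def overlaps(ch_a: int, ch_b: int, width_a: int = 20, width_b: int = 20) -> bool:
    """True if two idealised channels share any spectrum.

    Each network is modelled as occupying centre +/- width/2, so two
    networks overlap when their centres are closer than half the sum
    of their widths.
    """
    separation = abs(centre_mhz(ch_a) - centre_mhz(ch_b))
    return separation < (width_a + width_b) / 2


if __name__ == "__main__":
    print(overlaps(1, 6))               # 20 MHz networks on channels 1 and 6
    print(overlaps(1, 3))               # channel 3 sits on top of channel 1
    print(overlaps(6, 3, width_b=40))   # a 40 MHz network centred on channel 3
    print(overlaps(11, 3, width_b=40))  # the same wide network against channel 11
```

<div>
In this model a single 40 MHz network centred on channel 3 (a hypothetical placement) collides with both channel 1 and channel 6, leaving only channel 11 untouched.</div>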
<div>
<br /></div>
<div>
Almost every home router will attempt to best guess channels, but they won't necessarily do so all that effectively. You can usually beat most home routers in their guessing games. Some may be persuaded into choosing a new channel simply by being rebooted, but if they're not good at this game, it may not help. However, you'll usually find a way of manually choosing a channel if you login to the administrative interface of the device. A common pattern is all the routers in a neighbourhood try this guessing game at the same time following a power cut, and many of them choose the same channel - and stick to it. This does not work terribly well. </div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgn2joHjIgMJ56sr5gi12U_89hRTMCkiiBBTlGlTrxk-pzWbfxMX7qWldN558Lfg9hOy7CdavjthpZHbIb6MFhYreLwKGHfabBMhyrIcVG0tT25yr4rgbgV8K0ANtzh-CtADBlDUOALa60/s1600/VirginSmartHub3.0.PNG" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="928" data-original-width="1297" height="228" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgn2joHjIgMJ56sr5gi12U_89hRTMCkiiBBTlGlTrxk-pzWbfxMX7qWldN558Lfg9hOy7CdavjthpZHbIb6MFhYreLwKGHfabBMhyrIcVG0tT25yr4rgbgV8K0ANtzh-CtADBlDUOALa60/s320/VirginSmartHub3.0.PNG" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Manually setting wireless channels on a Virgin Media Hub3.0 <a href="https://en.wikipedia.org/wiki/DOCSIS">DOCSIS</a> cable modem</td></tr>
</tbody></table>
<div>
Above, you'll see the manual channel settings on a home router. Every model is slightly different - if in doubt, follow the manufacturer's instructions, or see if your ISP can help point you in the right direction (googling the router model is a good bet if you're not sure where to start!). Often, there are instructions (and login details) printed on the bottom of the router on a label.</div>
<div>
<br /></div>
<div>
I told it to use Channel 1, with a 20 MHz channel width for 2.4 GHz Wi-Fi; I also told it to use Channel 60 in the 5 GHz band. (I've not shown the graphs for 5 GHz, but this process results in a completely clean channel for 5 GHz in our flat, because all of our neighbours are using low, non-<a href="https://en.wikipedia.org/wiki/Dynamic_frequency_selection">DFS</a> channels). </div>
<div>
<br /></div>
<div>
Below, we can see what happens after applying the new settings. </div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjmoM6toANdvA3QdDhu3mLg8GNI7byXY4Gau0YuAAYSe5ArSJYkPz2s2dGWb8bq1-4xYP4dFTbOSWEsPn5M7ScpixycuTxf6fTgB7Ea5CI3OHTK45xD13bQ9-H3n7cPGDF1Z6wnaiPzvc/s1600/Screenshot_20200323-223114_WiFiman.jpg" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1478" data-original-width="1341" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjmoM6toANdvA3QdDhu3mLg8GNI7byXY4Gau0YuAAYSe5ArSJYkPz2s2dGWb8bq1-4xYP4dFTbOSWEsPn5M7ScpixycuTxf6fTgB7Ea5CI3OHTK45xD13bQ9-H3n7cPGDF1Z6wnaiPzvc/s320/Screenshot_20200323-223114_WiFiman.jpg" width="290" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">New settings applied. Note the MikroTik and VM7705786 networks no longer overlap with similar strengths. <br />
Hurrah, they will both respectively suck less.<br />
NB this is not necessarily the optimal solution here - but it *is* better than the previous one.<br />
Due to changes by other networks, channel 11 now looks better than 1 or 6 - at least in that position in that room and at the time of measurement.</td></tr>
</tbody></table>
<div>
And now, we have moved our network into a channel where it has a better signal-to-"noise" ratio. </div>
<div>
<br /></div>
<div>
Note also that the EE-BrightBox that was making channel 11 a bad idea has disappeared - and moved into channel 6, making 6 suck for the Mikrotik network! You cannot assume just doing this once will sort you out forever - it's now a regular household chore to make sure you've chosen the best channel (sorry!) - or conversely, if you note your Wi-Fi "sucks" more than usual, it might be time to check that your neighbouring routers haven't moved into the space your network is using!</div>
<div>
<br /></div>
<div>
If you want to look into this more, the technical terms of relevance are "co-channel interference" (CCI) for all those using the same channel, and "adjacent channel interference" (ACI) for all those neighbouring channels that overlap slightly. </div>
<div>
<br /></div>
<div>
You should go through the same exercise if your router does 5 GHz, too. Recently, I found virtually every wireless network in this building was using a single channel - meaning the other channels were free and clear - so my 5GHz wireless is now pretty good, because it is quiet, aside from our devices! One key difference between 2.4 and 5 GHz Wi-Fi is that typically, 5GHz is less good at punching through walls and things - it appears to attenuate more quickly with obstructions (in other words, its coverage looks worse). This is advantageous when you have many access points, but is of course annoying if you're trying to only use one to cover a large area. </div>
<div>
<br /></div>
<div>
In summary, </div>
<div>
<ol>
<li>Use a Wi-Fi scanner app to visualise the surrounding wireless environment.</li>
<li>Pick the least crowded channel. That means two things: choosing whichever of channels 1, 6 or 11 is least impinged upon by neighbours using wide channel widths or intermediate channels (2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14), and choosing the one with the largest "gap" between the signal strength of your Wi-Fi network and that of any neighbouring network(s) on the same channel.</li>
<li>Only use channels 1/6/11 in 2.4 GHz, with 20 MHz channel widths.</li>
<li>You may want to redo this in several places around your home, as every room will have a slightly different radio environment and you probably want to improve the whole household. Take notes!</li>
<li>Remember you may need to repeat this exercise periodically. </li>
<li>Finally, if you have the facility, do the same exercise for 5 GHz. </li>
</ol>
</div>
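<div>
As a rough illustration of the channel-picking steps above, here is a minimal Python sketch. The scan results are invented for the example, and the scoring rule (prefer the channel whose loudest overlapping neighbour is quietest) is a simplifying assumption, not how any particular scanner app works. It also ignores how busy each neighbouring network actually is.</div>

```python
# Hypothetical scan results: (network name, channel, signal strength in dBm).
# All names and numbers are invented for illustration.
scan = [
    ("NeighbourA", 1, -55),
    ("NeighbourB", 1, -70),
    ("NeighbourC", 6, -80),
    ("NeighbourD", 3, -60),
    ("NeighbourE", 11, -85),
]


def overlaps(channel_a, channel_b):
    """Treat 20 MHz-wide 2.4 GHz channels as overlapping when they are
    fewer than 5 channel numbers apart (which is why 1, 6 and 11 are clear of each other)."""
    return abs(channel_a - channel_b) < 5


def strongest_interferer(candidate, networks):
    """Return the strongest (closest to zero) dBm among overlapping networks, or None."""
    levels = [dbm for _, channel, dbm in networks if overlaps(candidate, channel)]
    return max(levels) if levels else None


def pick_channel(networks, candidates=(1, 6, 11)):
    """Pick the candidate channel whose loudest overlapping neighbour is quietest."""
    def loudness(candidate):
        level = strongest_interferer(candidate, networks)
        # If nothing was heard on or near this channel, treat it as silent.
        return level if level is not None else -200
    return min(candidates, key=loudness)


for candidate in (1, 6, 11):
    print(candidate, strongest_interferer(candidate, scan))
print("Suggested channel:", pick_channel(scan))
```

<div>
With these made-up numbers, the intermediate-channel neighbour on channel 3 spoils both 1 and 6, which is exactly why intermediate channels are unsociable. Repeat the measurement in several rooms before trusting the answer.</div>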
<div>
<br /></div>
<h3>
Location, location, location</h3>
<div>
You may find moving your router / Access Point (or devices whose Wi-Fi sucks!) makes huge differences to your experience. Radio waves are absorbed, scattered and reflected by many materials around your home. Metal tends to block wireless (refrigerators are huge chunks of metal); building materials tend to attenuate (absorb) radio waves, weakening the signal, and some may cause radio waves to bend (refract) or reflect (bounce), which can cause multipath interference. Human bodies are full of water, which blocks wireless - so standing between your device and the access point / router can cause you trouble. Even small changes in position (a few centimetres), particularly for handheld devices, can make a difference. Your app will show you how "strong" your wireless signal is. </div>
<div>
<br /></div>
<div>
So what to do about it? My first suggestion is to wander around your home with one of the aforementioned apps, checking whether the signal strength in each room is adequate. In my book, you want -70 dBm or better (in dBm, numbers closer to zero are stronger). If it is worse than that, take note of where that is the case. To underscore this point, -80 is really weak, -90 is basically unusable, -70 is OK, -60 is good, and -40 is about as strong as you can ever see it in the field. You're aiming to try to get -70 or better everywhere you want to use Wi-Fi; -60 or better if you're doing a lot of videoconferencing, VoIP or other packet-loss sensitive applications. The scale is logarithmic, which means relatively small changes in numbers are actually big changes - a 3 dB change is twice (or half) as strong. </div>
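<div>
To make the logarithmic scale concrete, here is a small Python sketch. The dBm-to-milliwatt formula is standard; the labels come from the figures above, but the exact cut-off points between them (for example, treating anything below -80 as unusable) are my assumption for illustration.</div>

```python
def dbm_to_milliwatts(dbm):
    """Convert a power level in dBm to milliwatts (0 dBm is 1 mW)."""
    return 10 ** (dbm / 10)


def describe_signal(dbm):
    """Label a reading using the rough guide above; the boundaries are assumed."""
    if dbm >= -60:
        return "good"
    if dbm >= -70:
        return "OK"
    if dbm >= -80:
        return "weak"
    return "basically unusable"


# A 3 dB improvement is roughly double the power; 10 dB is ten times.
print(dbm_to_milliwatts(-67) / dbm_to_milliwatts(-70))
print(dbm_to_milliwatts(-60) / dbm_to_milliwatts(-70))

# Hypothetical survey notes: room name and measured signal strength in dBm.
for room, reading in [("lounge", -52), ("office", -68), ("back bedroom", -83)]:
    print(room, reading, describe_signal(reading))
```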
<div>
<br /></div>
<div>
If you want to take this to the next level, you can use a heatmapper program, where you load a sketch or blueprint of your house (floor by floor) into the program, and take measurements of wireless strength, building up a map. Interpreting these takes a little practice, and collecting the data rigorously certainly does. On Windows, you can try Ekahau's <a href="https://www.ekahau.com/products/heatmapper/overview/">heatmapper</a>, or you can probably find something for another platform. But, to be honest, you can do a more than good enough job simply by noting signal strengths as you walk around!</div>
<div>
<br /></div>
<div>
So what do we do with these numbers? Well, you'll probably notice some "deadspots" in your coverage - you want to try and fill those in if you can. The only way to do this for minimal cost is either to try moving the router to cover them, or avoid them. My suggestion is to ensure you cover the most "critical" areas of your home first - that tends to be the living room, and any home office(s) - or spaces currently used as home offices. Prioritise surveying those. </div>
<div>
<br /></div>
<div>
Now that you have a "baseline" survey, you can change things (move the router) and re-survey to see what the results are. Although Wi-Fi is pretty magical, it isn't <i>magic</i> and strictly obeys the iron-clad rules of radio physics. There are some things that no amount of moving a single access point or wireless equipped SOHO router will fix. Large homes are hard to cover with a single access point, and some homes have challenging constructions, fittings or layouts. Your surveys will help to show you if this is the case, but as a rule-of-thumb, you're lucky to get coverage in a 30m (100ft) radius around a router indoors; 15m (50ft) is often more realistic. Outdoors, you can expect around about 100m (300ft). </div>
<div>
<br /></div>
<div>
However, if you have your router at the edge of your home, see if you can get it positioned closer to the centre of your home. If it is sitting right next to something that blocks signals (a refrigerator, boiler, radiator, or other large metal object), move it away from that. If it's next to the microwave or some other device that uses 2.4 or 5GHz signals, move it. If you've hidden the router in a cupboard, take it out of the cupboard. If the router is on the floor, pick it up and put it higher up! Try a number of different positions to see which offers the best coverage. In the event members of the household find routers un-aesthetic, you may need to negotiate some tricky political waters. I only offer technical solutions here - the domestic politics is up to you! :)</div>
<div>
<br /></div>
<div>
I actually solved a "mystery" wireless issue for my mother-in-law once. She noted that tea and Wi-Fi did not play together well. It turned out that "tea" in a busy household often meant your tea got cold, and you heated up your cold tea in the microwave. With the router right at the other end of a massive home, the microwave (when on) swamped the router's signal (they use the same 2.4 GHz frequencies, and microwaves are a HELL of a lot stronger than wireless Access Points). We fixed that by adding an access point in the kitchen, which resulted in a signal that could effectively compete with the microwave. Tea AND Facebook. Ah, the luxuries of the modern world! (Note that if you have an AP in the same room as the microwave and it still interferes, you should almost certainly replace that microwave!).</div>
<div>
<br /></div>
<div>
Another "location" factor you should be aware of is that devices with slow connections (because they are on the margin of vaguely usable wireless) can drag down the speed of your entire network. If two devices need to receive 100 megabytes of data, and one receives data at 100 megabytes a second, it will be done in one second. If another needs that same data, but is only downloading at 1 megabyte a second, because a marginal connection means it needs to work at a slower rate for reliable transmission, it will take 100 seconds. Remember that while one device is using the channel, it's <i>blocked for all other uses</i>. In reality, it's not quite this bad (they'll often take turns, transferring the data in chunks called "packets") - but it is a definite and noticeable effect ("slowness", choppy or jerky video/audio, painful remote desktop experiences). This means if there are rooms where devices get really bad reception, you may want to encourage fellow home-dwellers to avoid using those spaces for wireless-related activities for the greater good. Worse than that, marginal Wi-Fi often fails to correctly send data, so those packets of data need to be re-sent - possibly doubling (or worse) the amount of time it takes to send/receive that data!</div>
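<div>
The arithmetic in the example above can be sketched in a few lines of Python. This is a deliberately simplified model (one device at a time, no protocol overhead), and the <code>retry_fraction</code> parameter is a made-up name for the share of data that has to be sent a second time.</div>

```python
def transfer_seconds(megabytes, rate_megabytes_per_second, retry_fraction=0.0):
    """Airtime needed to move the data, counting any portion that must be re-sent once."""
    return megabytes * (1 + retry_fraction) / rate_megabytes_per_second


fast = transfer_seconds(100, 100)  # device with a strong signal
slow = transfer_seconds(100, 1)    # device on the margin of coverage
print(fast, slow)

# Share of the total airtime eaten by the slow device for the same 100 megabytes.
print(slow / (fast + slow))

# If every packet to the marginal device needs sending twice, its airtime doubles.
print(transfer_seconds(100, 1, retry_fraction=1.0))
```

<div>
Both devices moved the same amount of data, but the marginal one occupied the channel for about 99% of the time.</div>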
<div>
<br /></div>
<div>
Low signal strength means slow, possibly unreliable Wi-Fi. Avoid!</div>
<div>
<br /></div>
<div>
In summary:</div>
<div>
<ol>
<li>Ensure you've first optimised your channel allocation (remembering that a single room measurement won't necessarily give you the best answer, as different rooms will likely have different networks interfering with yours). </li>
<li>Measure (survey) Wi-Fi coverage around your home, and note down the current values.</li>
<li>Consider better placement of your router / Access Point to better cover the space.</li>
<ol>
<li>Raise it higher</li>
<li>Take it out of a cupboard / closet(!)</li>
<li>Move it away from things that block or interfere with wireless signals</li>
<li>Move it closer to the centre of the desired coverage area, if possible. </li>
</ol>
<li>Each time you make a change, redo your survey to see if things are better or worse. Take notes!</li>
<li>Consider if you can adapt usage to stick to well covered areas and avoid areas you cannot improve. </li>
</ol>
</div>
<div>
<br /></div>
<h3>
Some Devices Just Suck.</h3>
<div>
It is possible that you have one device that just seems to have terrible Wi-Fi everywhere. That likely means it is broken, and should be repaired or replaced. Some models of device are just really bad - terrible antenna design (remember <a href="https://en.wikipedia.org/wiki/IPhone_4#Antenna">antennagate</a>?), poor radio choices, or bad firmware. There is unfortunately nothing you can do to fix that, and you should return the device as unusable. It is also possible to find devices that break your network - if you note a strong correlation between device X being connected and terrible performance generally, think poorly of device X and remove it (permanently).</div>
<div>
<br /></div>
<h3>
Mo' devices, Mo' problems...</h3>
<div>
Wi-Fi is a victim of its own success. The more devices use a wireless network, the more congested it becomes - the airwaves get like rush hour traffic, and things eventually grind to a halt. With many people now running around with about 3 personal devices on Wi-Fi (smartphone, tablet, laptop), increasing numbers of "smart home" devices, cameras, smart TVs, smart toasters, smart fridges, smart lights, smart toilets(!) and the like, there is a lot of competition for scarce airtime in a busy wireless network. If you are in a household that sounds like this, and your Wi-Fi sucks, unfortunately, you are likely to need to move immediately onto options that involve spending money fixing the problem (by adding more wireless access points closer to the devices, spreading the load). For no/low cost, any device that is turned off, and any device that is moved onto a wired connection and has Wi-Fi disabled, no longer contributes to the problem. </div>
<div>
<br /></div>
<h2>
Collaborate and promote visibility</h2>
<div>
There are some things you can do with other people to collectively improve things, and highlight that there are some improvements everyone around you can be making to improve home networks, particularly in high occupant density buildings. </div>
<div>
<br /></div>
<h3>
Be a good Wi-Fi neighbour; help your neighbours reciprocate</h3>
<div>
Because Wi-Fi doesn't magically respect property boundaries, you may need to encourage your neighbours to get with the programme on appropriate channel use and channel width. If you have an existing mailing group or WhatsApp group for your neighbours, that is a good, low risk way of communicating the desired changes. </div>
<div>
<br /></div>
<div>
You can perhaps offer to help your neighbours over e.g. TeamViewer to optimise their networks too - to the benefit of all in your immediate surroundings. It is unlikely many of them are intentionally choosing poor channels or have elected to use unsociable channel widths (manufacturers of SOHO Wi-Fi offer that out the box!) - discussing how you can communally unsuck your Wi-Fi is perhaps the new "mow your lawn" of neighbourly relations! </div>
<div>
<br /></div>
<div>
Likewise, you don't want to "weaponise" your knowledge to the detriment of your neighbours, either - that's just not cool. </div>
<div>
<br /></div>
<div>
If you're not in contact with your neighbours, and there is no existing neighbour meeting group/association, you may want to consider (safely) "mail-shotting" your complex through letterboxes. Make sure you minimise risks to your neighbours when you do so ("life" of SARS-CoV-2 seems to be around 72 hours on paper/cardboard!). Make sure you give them contact details to get in touch with you if they need help or have questions, and point them at suitable guides to help themselves. Don't accuse anyone of being in the wrong - you're quite likely to blame the wrong person, and that never helps neighbourly relationships. </div>
<div>
<br /></div>
<div>
Be the good neighbour!</div>
<div>
<br /></div>
<div>
The major communal "un-sucking" activities you can jointly engage in are:</div>
<div>
<ol>
<li>coordinated channel plans (remember Wi-Fi works in 3D - consider upstairs and downstairs!) - people jointly decide what channels each home should use, and they stick to that. </li>
<li>making use of the smallest usable channel widths (20 MHz in both 2.4 and 5 GHz) </li>
<li>possibly trying more reasonable router locations (i.e. adjoining neighbours shouldn't have their APs against the same shared wall). </li>
<li>possibly, if supported, tuning output power on excessively "loud" devices. </li>
</ol>
</div>
<div>
If you have neighbours from hell, there is little you can do beyond trying to find the best channel you can - ethernet cables, of course, remain a technical (if not necessarily aesthetically pleasing) fix for most work devices that neatly side-step cluttered airwaves. </div>
<div>
<br /></div>
<div>
I wish the developers of apartment complexes would build <a href="https://en.wikipedia.org/wiki/Faraday_cage">faraday cages</a> into each unit at the boundary walls, floors, ceilings, windows and doors, but that would result in terrible cellular phone and commercial radio experiences (but make cohabiting wireless networks with many others much easier). It's really unlikely to ever happen. </div>
<div>
<br /></div>
<h2>
Focus on Value (low cost)</h2>
<div>
Eventually, you're going to need to spend some money on improving things, if the "free" tips don't solve the problem. </div>
<div>
<br /></div>
<h3>
It may be called wireless, but the more wires you use, the better....!</h3>
<div>
I hinted earlier that all devices using Wi-Fi interfere with each other - that includes all those in your home network, of course. Wi-Fi is inherently a half duplex medium - only one single device can ever be "talking" at once in a given channel. Your access point has to grab some airtime to send to client devices; your devices each have to take it in turns to send some data towards the access point - only ever one at a time and only ever in one direction. So, if you have a lot of wireless devices, your Wi-Fi will, almost inevitably, start to "suck". </div>
<div>
<br /></div>
<div>
A fairly cheap - and very effective - solution to this is to wire up those devices you can, particularly those that are always on, use a lot of bandwidth, or are "critical" infrastructure. Use an <b>ethernet cable</b> and connect up the ethernet port on your device to the ethernet port on the back of your SOHO router - most have at least 4. A strong recommendation, if your home entertainment system and router share a room, is to hook up your smart TV and/or gaming console or streaming box via an ethernet cable. They are available very cheaply pretty much anywhere that sells computer gear in a variety of lengths and colours. Get "Cat 6" ones, if you can, although Cat 5e are adequate for most uses. </div>
<div>
<br /></div>
<div>
In many cases, you may already have a short ethernet cable lying around the place - in the box the router came in, or in your "big box of random cables" - use it/them!</div>
<div>
<br /></div>
<div>
Each device you move onto a wired connection no longer competes for wireless bandwidth (particularly if you turn off Wi-Fi completely on that device) - instead, it has 100 (or possibly even 1,000) megabits per second of dedicated, bi-directional, full duplex bandwidth between the device and your router. This beats the pants off of pretty much all Wi-Fi, and scales nicely. </div>
<div>
<br /></div>
<div>
If you want to take the next step, then running wires to wherever you happen to do home office activities also makes a lot of sense - this can be a real pain in some homes, and always risks marital discord between those who scarcely notice cables and are pleased with their performance, and those who see cables as a hideous eyesore and don't care about performance. Most people aren't going to want to go through the hassle of installing conduit in the wall, or affix trunking on top of it. You can get flat (ish) cables, but those aren't really standards compliant, and I'm hesitant to recommend them - but if that is the compromise you can come to, it can be worth trying. </div>
<div>
<br /></div>
<div>
My rule of thumb on this is <b>"wire what you can, Wi-Fi what you must"</b> - some modern devices simply can't use wired networking!</div>
<div>
<br /></div>
<div>
Of course, you may also be able to use wires that are already in your walls. <b>Powerline ethernet </b>standards allow you to use electrical wiring in your home to distribute networking between a unit plugged into your home router, and another device somewhere else. Some powerline ethernet devices even have built in wireless access points to put up another "hotspot" in another room. I used one of these in my parents' house to get good internet in the downstairs lounge/kitchen area when my mom was sick and couldn't really get upstairs all that easily to get online on the PC or into the stronger wireless of the router up there. These make a reasonable option for people renting, in particular, or those in whose hands a masonry drill is a weapon of mass destruction, prone to causing plumbing leaks or hitting electrical cables! They're not typically quite as fast as dedicated ethernet cables, and there are some other gotchas (they may not work between different circuits, depending on your home's electrical systems). They can also <a href="https://www.ban-plt.org.uk/what.php">interfere with radio</a>. There is also arguably a marginal increase in risk to your equipment from lightning damage (or from faulty powerline ethernet gear, but most is fairly sensibly designed to not put mains voltage onto your device's network port...!). But for wiring averse homes, they're a potentially very useful tool. </div>
<div>
<br /></div>
<h2>
Progress iteratively with feedback (cost)</h2>
<div>
If you realise you're beyond where moving things around will help, you're going to need more gear. Eventually, you need more than one access point to adequately cover an area (look around most large offices and public spaces, you'll see quite a few APs scattered around, usually on the ceiling). If you think you're at that stage, you may need to start adding access points - add one where things are really dire, and then re-survey, tweaking as you go. If you still have a deadspot, add another - and so on. </div>
<div>
<br /></div>
<h3>
What about wireless wireless? (mesh networking)</h3>
<div>
As someone who has been professionally involved in the roll-out of wireless networks over several years, I could probably buy a round of drinks at a bar if I got money every time someone asked why there were so many wires involved in rolling out wireless... When you install proper "enterprise grade" wireless access points, each one needs (at least) one Ethernet cable to it. </div>
<div>
<br /></div>
<div>
But you can also use wireless as the "backhaul" between your access points and the router that sits between your home and the Internet. The access points use one radio frequency to form a backbone between them, and others to give their associated client devices Wi-Fi. Typically, people call this "mesh networking", and it is an area of considerable growth at the moment. </div>
<div>
<br /></div>
<div>
If you're looking to "home install" a more robust wireless infrastructure without needing to drill through walls and mount things on the ceiling, this is as good as it is going to get. </div>
<div>
<br /></div>
<div>
Stick to equipment from a single vendor for ease of use (and so they don't blame each other when things don't quite go according to plan). </div>
<div>
<br /></div>
<div>
Some options:</div>
<div>
<ul>
<li>Ubiquiti's <a href="https://amplifi.com/">Amplifi</a> </li>
<li><a href="https://support.google.com/wifi/answer/7168315?hl=en-GB">Google Wifi</a></li>
<li><a href="https://www.netgear.com/orbi/">Netgear's Orbi range</a></li>
<li>some other reviews </li>
<ul>
<li><a href="https://www.techradar.com/uk/news/best-wireless-mesh-routers">https://www.techradar.com/uk/news/best-wireless-mesh-routers</a></li>
<li><a href="https://uk.pcmag.com/wireless-networking/87178/the-best-wi-fi-mesh-network-systems-for-2020">https://uk.pcmag.com/wireless-networking/87178/the-best-wi-fi-mesh-network-systems-for-2020</a></li>
</ul>
<li>MikroTik have recently launched the <a href="https://mikrotik.com/product/audience">Audience</a>, although if you're expected to set them up through RouterOS, they're going to be hard for non-IT enthusiasts to get up and running.</li>
</ul>
</div>
<div>
When you roll them out, note that they have to be well within "reach" of each other in order to work well. If the backhaul radios don't have strong communication between them, you may experience problems. When you're optimising your network, if the platform allows, I would suggest you start by getting the backhaul radio channel as good as you can, and then use a different client-facing channel on each node in the mesh (one 2.4 and one 5GHz, ideally) - if you have more than 3, you'll have to start recycling 2.4 GHz channels. You should absolutely experiment with placing of the devices to optimise coverage and backhaul speed. </div>
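<div>
The "recycling" of 2.4 GHz channels once you have more than three nodes can be sketched as below. The node names are hypothetical, and plain round-robin is an assumption for illustration: it knows nothing about your floor plan, so in practice you would want nodes that share a channel to be as far apart as you can manage.</div>

```python
from itertools import cycle


def assign_24ghz_channels(node_names, channels=(1, 6, 11)):
    """Hand out the non-overlapping 2.4 GHz channels in turn, reusing them when they run out."""
    return dict(zip(node_names, cycle(channels)))


# Hypothetical four-node mesh; the fourth node has to reuse channel 1.
plan = assign_24ghz_channels(["lounge", "kitchen", "office", "bedroom"])
for node, channel in plan.items():
    print(node, channel)
```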
<div>
<br /></div>
<div>
Note that in some cases, you may suffer from some of the problems of wireless range extenders (no dedicated backhaul radio), but most of these systems are designed to be "easy" to install. If they say something like "tri-band" that's likely to fully deliver on the promise of such systems (insomuch as wireless backhaul can), as they have dedicated backhaul radios.</div>
<div>
<br /></div>
<div>
In some networks, wireless backhaul may not be adequate. At this stage, you're going to need to roll out an "enterprise lite" wired backhaul wireless network! Most homes will not have this problem, however.</div>
<div>
<br /></div>
<h3>
There is a level of Wi-Fi use beyond which you cannot help but spend money...</h3>
<div>
This is where people will look at you funny, because you're running wires for wireless. </div>
<div>
<br /></div>
<div>
This is arguably the ultimate level of Wi-Fi sophistication in a home. You're running extremely heavy, data-intensive wireless access (or like to give all of your over-the-top house party guests wireless!), or are simply unwilling to put up with compromise, or - perhaps - you want a toy you can play with to learn more, or own a large property, or one with radio frequency propagation patterns that are not good (like the <a href="https://www.theguardian.com/technology/blog/2010/jan/02/wifi-walls-plaster-lath-wire-blocked">chicken mesh plaster layer</a> common in tech capital San Francisco), or have metre thick granite walls, or lots of rebar, etc. </div>
<div>
<br /></div>
<div>
If you're going to this level of effort/expenditure, unless you are someone that is extremely keen on doing the learning required to do it properly, and are keen to be running network wires throughout your property yourself, then it's probably best to contract an experienced wireless implementation team to do it for you. If you are keen to do this yourself, this blog post isn't going to cover it in enough detail for you to be immediately successful (sorry!). It's entirely possible to spend money and still end up with suck-y Wi-Fi if it's not done right! </div>
<div>
<br /></div>
<div>
If you do get professionals in, demand they fully certify the backhaul cables (between your "network centre" and each AP or wall outlet) to the correct standard (likely Cat 5e or Cat 6) - whether or not they certify cable plant is the best mark of quality between professional installers and fly-by-night operations (in part, because cable certifiers are mid-range car purchase levels of money). They should give you a certificate/printout or datafile with all the results in it, and all of the cables should pass the test - or else demand they rectify them! They should also ideally give you a post-installation survey map, showing the coverage of your property in the 2.4 and 5 GHz bands. </div>
<div>
<br /></div>
<div>
As a starting point to the sorts of things you will need to understand in order to do so yourself: you'll need to figure out how many individual access points you need in your space, choose AP models, decide what the channel assignments and signal strengths should be for each access point, run wires back to a central location, possibly set up VLANs to isolate various networks from each other, have a PoE switch to power the access points and so on. And then verify and adjust based on what real life physics has put in the way of your glorious paper plan! You'll probably have to run a small server to manage the infrastructure. All of this takes at least some learning and ideally some practical experience - in most cases, if I.T. isn't your day job, or a passion / hobby, it's going to be better to leave this to someone else. If you are looking to take the plunge, for a modest system you can do worse than Ubiquiti's UniFi range, possibly with a <a href="https://store.ui.com/products/unifi-dream-machine">Dream Machine</a> at the heart of it. If you know someone who has a "day job" involving wireless networks, they might be prepared to give you some assistance. They'll certainly have some opinions! Do pay them for their time, and if they refuse, get them a nice thank you gift anyway. :) </div>
<div>
<br /></div>
<div>
If you go down this route and have a large home, <a href="https://www.troyhunt.com/ubiquiti-all-the-things-how-i-finally-fixed-my-dodgy-wifi/">this may be the sort of thing you're in for</a> if you DIY.<br />
<br />
Obviously, it's unwise, likely even unethical, to get people to come out and install things in your home right at the moment (March 2020) - you may need to do what you can based on what gear you can get delivered from online stores, using advice you can find online, and your own experimentation - and hopefully, appeal for calm within the household because there are "wires everywhere" until you can get things sorted professionally (or otherwise). </div>
<div>
<br /></div>
<h2>
Think and Work Holistically (Wi-Fi is not the only thing you need to think about)</h2>
<div>
There are things that "feel" like "bad Wi-Fi" that might not actually be the fault of the Wi-Fi. </div>
<div>
<br /></div>
<h3>
Internet connectivity</h3>
<div>
It should be obvious, but it bears further consideration. Your Internet connection is probably often the "bottleneck" in your connectivity and apparent network performance. If you have a slow internet connection, then no amount of "great Wi-Fi" is going to help for the vast majority of users, for whom connecting to resources on the Internet has become synonymous with Wi-Fi (such that I've been asked to "fix broken wifi" on a wired-only device, which is technically quite challenging, until you understand what they actually want!). This, regrettably, may mean you need to look at an alternative provider, or increasing your spend with your current provider. In an age of quarantines and lockdowns, you probably don't want to (and may not be able to) order new services - but you may be able to upgrade to a faster service on your current connection. </div>
<div>
<br /></div>
<div>
If you'd like some "rules of thumb" - </div>
<div>
<ul>
<li>4K HDR Netflix is around 25 megabits per second, per device doing this (ouch); 4k YouTube is similar</li>
<li>HD content is usually around 4-10 megabits per second, per concurrent stream.</li>
<li>SD Netflix is around 2-4 megabits per second, per concurrent stream (so for each device doing Netflix at once). Youtube is similar. </li>
<li>Most videoconferencing solutions will require around 2-4 megabits per second, but very high resolution ones may be more.</li>
<li>Audio-only streaming is more modest, usually less than 256 kilobits per second (i.e. ~0.25 megabits per second).</li>
<li>Loading static webpages (no embedded rich media) may take a few megabytes per page-load, but is "bursty", so several users can effectively share that bandwidth. Same for email. </li>
<li>The UN considers a "broadband" connection to be 1.5 megabits per second. I think this is low, and that should be at least 2.5 megabits per second <i>per concurrently connected device or user</i>, assuming each user is just viewing webpages and doing email. Add up the number of internet connected devices in your house and multiply by 2.5. Is your internet connection at least that fast in megabits per second? If not, it will sometimes (or always) feel slow. </li>
<li>Using Remote Desktop Protocol (RDP) to a computer in the office needs quite a lot of bandwidth (10 megabits per second for complex graphics, possibly more if trying to stream video[!]), and does <i>not</i> deal well with packet loss and latency. If you're doing that, I strongly suggest a cabled connection - like for our architect flatmate who uses the bedroom farthest from the router, and uses Revit remotely. Yesterday, they were incredibly frustrated until I pulled a 20m LAN cable out of a box and connected them up. Instantly usable, and, as a bonus, Netflix no longer sucked for them!</li>
</ul>
</div>
<div>
In summary of the above figures, streaming media EATS your bandwidth, and you may need to curtail its use by forcing devices to use lower quality settings, or persuading other members of the household to not use them during "business hours" if it's interfering with work-from-home. In jargon terms, the "contention ratio" for streaming media has to be 1:1 - if you need 25 megabits per second for adequate performance, <i>each user</i> needs that, all the time. For non-streaming media, the contention ratio can be higher - several people can share available bandwidth, because they're only using it some of the time (up to a point, of course). Obviously, this bandwidth requirement will apply to both your Internet connection AND the available performance of your Wi-Fi network - whichever is worse. Static web-pages and email are easy to provide on most modern broadband connections. A household full of people streaming (be it Netflix or video conferences) is much harder. Whatever networking you do - if you have limited capacity, streaming sucks!</div>
<div>
<br /></div>
<div>
To work out if you have enough Internet bandwidth, you need to figure out what the sum total of your "internet usage" is - what is the worst case scenario you expect to see? Add up the bandwidth of each of those, and see if the total is more than what you have. If so, you need to manage that, either by reducing or managing demand (not doing some things) or getting more bandwidth. </div>
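<div>
If you'd rather let a computer do the adding up, here is a minimal Python sketch of that worst-case sum. The per-activity figures are taken from the upper end of the rules of thumb above; the household and the 35 megabit connection are invented for the example.</div>

```python
# Rough megabits per second needed per concurrent stream or user,
# using the upper end of the rules of thumb listed earlier.
MEGABITS_PER_SECOND = {
    "4k_video": 25,
    "hd_video": 10,
    "sd_video": 4,
    "video_call": 4,
    "audio": 0.25,
    "web_email": 2.5,
}


def worst_case_demand(activities):
    """Sum the bandwidth of everything that might realistically happen at once."""
    return sum(MEGABITS_PER_SECOND[name] * count for name, count in activities.items())


# Hypothetical household: one 4K stream, two video calls, three people browsing.
household = {"4k_video": 1, "video_call": 2, "web_email": 3}
connection_mbps = 35  # assumed speed of the internet connection

demand = worst_case_demand(household)
print(demand)
if demand > connection_mbps:
    print("Not enough: reduce demand or get more bandwidth")
else:
    print("Should cope")
```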
<div>
<br /></div>
<div>
If the internet "feels" slow, you may be able to see how much of your bandwidth is being used, either on a webpage on your ISP's site, or on an administrative page on your router/access point (unfortunately, many home devices don't have this option). You can also try doing online speedtests (try <a href="https://fast.com/">fast.com</a> or <a href="https://www.speedtest.net/">speedtest.net</a>), but that is only a proxy measurement, and only validly measures your speed and performance when the only thing happening on that connection is the speedtest. </div>
<div>
<br /></div>
<div>
Remember that many home internet connections are asymmetric (ADSL, cable internet, some fibre packages, most satellite connections, many WISPs) - they have relatively fast downloads (from the Internet), and relatively slow uploads (to the Internet) - so you can download files, emails, etc. quickly, but uploading content, or sending emails or large files seems to take ages. Another area where limited upload capacity can be an issue is videoconferencing, if you are trying to send high resolution streams of your desktop or webcam to the conference. For many people, this asymmetry is not an issue. However, if this is a significant problem for you, you may want to investigate if you can use symmetrical Internet services where you live, where the upload and download speeds are the same. This is most likely to be available at high speed over fibre internet (where available). </div>
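<div>
To see why asymmetry hurts, here is a quick Python sketch comparing how long the same file takes in each direction. The 50 down / 5 up connection and the 500 megabyte file are invented figures, and the calculation ignores protocol overhead. Note the factor of 8: file sizes are in megabytes, connection speeds in megabits.</div>

```python
def transfer_minutes(size_megabytes, link_megabits_per_second):
    """Minutes needed to move a file over a link, ignoring overhead."""
    megabits = size_megabytes * 8  # 8 bits in a byte
    return megabits / link_megabits_per_second / 60


# Hypothetical asymmetric connection: 50 megabits down, 5 megabits up.
download = transfer_minutes(500, 50)
upload = transfer_minutes(500, 5)
print(round(download, 1), round(upload, 1))
```

<div>
The same file that arrives in a little over a minute takes more than thirteen minutes to send back out.</div>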
<div>
<br /></div>
<div>
You may like to ask your employer about subsidising your internet connectivity! A simple way for them to get this done is to hand out business-only portable wireless hotspots, or use cellular modem equipped laptops. Of course, what is simple day-to-day gets rather complicated at scale as an afterthought, and during a pandemic lockdown! It's certainly easier for them to transfer you money to cover your existing home connectivity costs. </div>
<div>
<br /></div>
<div>
If you can get it, fibre is always the best choice from a technical perspective. Next up, probably cable internet (not only from cable TV companies). Next, probably ADSL, although that's getting pretty slow these days. Cellular / LTE can be usable, but suffers from some of the same problems as Wi-Fi (it's bound by the same laws of physics and half duplex, although there is more centralised control of spectrum use) - and is usually expensive. Satellite internet tends to be slow, expensive and extremely high latency. In some areas, wireless ISPs (WISPs) may be an option too. </div>
<div>
<br /></div>
<div>
But, of course, any Internet connection is better than nothing - you may simply have to change how you use it. You're obviously somewhat limited to what is actually available where you are (and what you can realistically afford). Pick the one that offers the best price/performance ratio for you. </div>
<div>
<br /></div>
<div>
In summary:</div>
<div>
<ul>
<li>Streaming video is the hardest thing you can ask your network and internet connection to handle.</li>
<li>You may need to manage demand by staggering video conferences and "entertainment" streaming. If your connection is limited enough, you may need to stagger video conferences between different people working from home. </li>
<li>Remember your capacity planning needs to consider the realistic "worst case" in your household: how many devices are connected concurrently, and what they are doing. </li>
</ul>
</div>
<h3>
</h3>
<h3>
Backup internet connectivity</h3>
<div>
Most people have a smartphone with mobile internet connectivity (3G, 4G, LTE, 5G) - you can use this to get online on the device, but most also support "hotspot" mode, which will allow you to share that connectivity with another device in the room (such as your laptop for that vital work email or conference call). </div>
<div>
<br /></div>
<div>
Unfortunately, mobile data is usually expensive and often capped, so you should typically reserve this for emergencies or "critical" use; you may also want to think about what you use it for (it's usually adequate for non-streaming internet use and email - it will get clobbered if you play Netflix or YouTube etc. all day long, or have long videoconferences). Most work-related stuff (which all too often comes down to emailing stuff around) is relatively low bandwidth. This can get really expensive in developing world economies, in particular (I dreaded using my monthly 500MB of mobile data in South Africa. Here, I don't really think about it - I get 4GB, 100 minutes and unlimited SMS for ten pounds - but I also don't stream over it). </div>
<div>
<br /></div>
<div>
Remember that "hotspot" modes and mobile cellular routers / hotspots all broadcast Wi-Fi signals, so they may make the general wireless environment around them worse for others (I've fought this battle in school and university residences / boarding houses; it's a real pain). If you need a better connection just for your work laptop, see if you can get one with a built in cellular (3G / 4G / LTE / 5G) connection (i.e. you can put a SIM card into the laptop), or if you can get a non-Wi-Fi cellular USB device. Of course, if all you have is a phone in hotspot mode or a personal "MiFi" cellular hotspot, you use what you have!</div>
<div>
<br /></div>
<div>
However, once Internet connectivity is "mission critical", you may need to consider a backup connectivity option. Certainly, if you're doing "work from home", this sort of consideration is "table stakes", in my opinion. You may like to ask your employer about subsidising your "work" internet connectivity!</div>
<div>
<br /></div>
<h3>
House Rules</h3>
<div>
This isn't really IT related, but is in that humanities and social science axis that many IT people stereotypically lack skill in - you probably need to negotiate or lay down some ground rules about appropriate internet use, whether any particular uses supersede others, and whether that is time-bound. It can make a difference! </div>
<div>
<br /></div>
<div>
A good example might be that it's not acceptable to be watching 4K Netflix while someone else is on an important business videoconference, particularly if your Wi-Fi is not up to the job, or your Internet connection is stretched doing both at the same time. If the only acceptable performance is in the living room, others may need to get out whilst vital business activities take place - in such a case, you're going to want to find ways of improving this situation, if you can.</div>
<div>
<br /></div>
<div>
One thing people with kids at home may like to consider is the number of devices they like to (attempt to) use at the same time. A few times, I took a stroll through boarding houses at a school I worked at. I noticed a number of kids each time attempting to use more than one device at a time. I'm not entirely sure how their brains were supposed to cope with this, but they were literally trying to stream HD or 4K YouTube on one device (this is my radio, Sir! [then why in 4K, I ask myself...]), play an online game on another, and stream something else or chat with others on a third. During prep (homework) time. Madness (or Gen Z superpower?). Also, a huge waste of bandwidth and wireless capacity. Talk your kids into only doing one thing at once on a single device (and perhaps agree on appropriate times for various activities, and what stream quality they should be using). A more draconian option may be to reserve capacity by changing wireless passwords between "work time" and "play time", although there is administrative overhead to doing that, and social fall-out!<br />
<br />
I'm not sure if this is harder in a household of parents-with-kids or a household of adults with different priorities, or where perhaps landlords do not lay on enough bandwidth for the number of people living there.<br />
<br />
Also, remember that sound travels. Virtual pubs until 2am with your mates may be hilarious for you, and a welcome respite for the extroverts trapped in their own home, but those next door may not find it quite as amusing if their bedtime was several hours ago, and you're keeping them awake. More seriously, if you have a loud voice and teleconference a lot, see if there is a way you can modulate your own voice down a bit - it can be very trying for those around you used to / needing quiet workspaces. Headset microphones allow you to talk quite quietly and still be very audible to those listening to you.<br />
Similarly, the bass of loud music is not great. Whoever around here has been playing what sounds like reggae (judging from the bassline) for the past week, at high volume, from some distance away but still loud enough to be annoying AF, really needs to stop. Is it murder, manslaughter, or simply public order policing if they are silenced permanently? Asking for a friend......!</div>
<div>
<br /></div>
<h3>
Latency</h3>
<div>
Sometimes, "slowness" isn't because your bandwidth is limited, or you have poor Wi-Fi. Sometimes things just ARE slow. There is a lag between you starting a request and that request doing the round trip from your device, across your network, across the Internet to the service's servers - and back again with the data you wanted. Sometimes this is on the other side of the planet, and that takes a while. If you use satellite internet, it's a REALLY long round trip! You can often use tools like ping to investigate latency to particular sites or services. Online gamers are particularly concerned with latency.<br />
<br />
Latency comes from two major sources - first, distance. If you are 6,000 kilometers away from something, even at the speed of light in glass, it takes some time to get there and back again (about 1 millisecond per 100km) - and you might notice this. There is no way to cheat this universal speed limit, other than physically moving closer. However, sometimes you will experience unexpected latency increases - depending on where the latency shoots up, you may need to enlist the support structures of your ISP or the end service. Secondly, routers or servers can get busy and delay processing or transferring the data - or they might simply be overloaded. </div>
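<div>
The "about 1 millisecond per 100km" rule of thumb can be checked with a little arithmetic. The sketch below assumes a refractive index of 1.5 for optical fibre (a round number close to typical values) and a perfectly straight cable with no equipment in the path, so it gives a floor rather than a prediction; the function and constant names are made up for this example. </div>

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in a vacuum
FIBRE_REFRACTIVE_INDEX = 1.5           # assumption: a typical round figure for glass fibre


def min_round_trip_ms(distance_km, refractive_index=FIBRE_REFRACTIVE_INDEX):
    """Lowest possible there-and-back time over fibre, in milliseconds."""
    speed_in_glass_km_per_s = SPEED_OF_LIGHT_KM_PER_S / refractive_index
    one_way_seconds = distance_km / speed_in_glass_km_per_s
    return 2 * one_way_seconds * 1000


if __name__ == "__main__":
    for km in (100, 1000, 6000):
        print(km, "km: at least", round(min_round_trip_ms(km), 1), "ms round trip")
```

<div>
For 6,000 kilometres this comes out at roughly 60 milliseconds before any router or server has spent time on your request. Real-world pings will be higher, because cables do not run in straight lines and every hop adds a little delay. </div>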
<div>
<br /></div>
<div>
Due to several factors around how Wi-Fi works, it's more prone to latency and jitter (variable latency) than wired connectivity. Usually this is only in the order of a few extra milliseconds, and not all that relevant, but in "bad" Wi-Fi environments you may see it shoot up a lot (usually with associated packet loss). </div>
<div>
<br /></div>
<div>
A combination of ping and traceroute is useful for pinning down where the problem lies. Be warned, this is quite technical, and the results will often cause people to ask you a lot of technical questions. For non-technical users, it's often better to hope latency problems just go away. </div>
<div>
<br /></div>
<h3>
Packet Loss </h3>
<div>
Information travels across the Internet in little bundles called "packets". Sometimes, these get lost in transit somewhere. There is not much you can do about packets that get lost on the broader internet (although if you can gather evidence of where the problem is you may be able to get your ISP's network engineers to look into it in cooperation with others). But you can probably do things about packet loss between your device and your local Access Point or router. Wireless packet loss is almost always associated with poor signal strength. If you experience high packet loss, improve the signal strength (or go wired!). This may mean moving the device, router / access point, or both. </div>
<div>
<ul>
<li>If only one single site or service is affected, it's almost certainly a problem with that site or service - ask your ISP to work with that service to resolve. </li>
<li>If moving the device right next to the wireless hotspot doesn't help, and it affects all sites and services, then it's probably the device itself that is faulty. </li>
<li>If all devices on your network behave like this, it may be the router/access point or internet connection that is at fault - in these latter cases, seek the assistance of your ISP or friendly local IT technician to diagnose and resolve (remembering social distancing!). </li>
</ul>
</div>
<div>
The ping and traceroute utilities can help to illustrate packet loss and jitter, as can some broadband speed tests. </div>
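<div>
If you have noted down a series of ping results, the sketch below shows one way of turning them into the loss and jitter figures discussed above. It does not run ping itself: the input is a list of round-trip times in milliseconds that you supply, with None standing in for each lost reply. The function name is made up for this example, and "jitter" here is simply the average change between consecutive replies, which is one of several definitions in use. </div>

```python
from statistics import mean


def summarise_pings(rtts_ms):
    """Summarise ping results.

    rtts_ms has one entry per ping sent: the round-trip time in
    milliseconds, or None if no reply came back.
    """
    if not rtts_ms:
        raise ValueError("need at least one ping result")
    received = [rtt for rtt in rtts_ms if rtt is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    # Jitter: mean absolute difference between consecutive received replies.
    changes = [abs(later - earlier) for earlier, later in zip(received, received[1:])]
    return {
        "loss_pct": loss_pct,
        "avg_ms": mean(received) if received else None,
        "jitter_ms": mean(changes) if changes else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical results: five pings sent, the third one lost.
    print(summarise_pings([20.0, 22.0, None, 30.0, 24.0]))
```

<div>
Comparing the summary for a device next to the access point with one taken from the far end of the house is a quick way to see whether signal strength is the culprit. </div>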
<div>
<br /></div>
<h3>
Security</h3>
<div>
One thing you should also think about (always) with IT stuff is security of your information. You should consider whether sending information over a wireless link is "safe". There was a time when you could use a browser plugin to intercept people's logins (remember Firesheep, anyone?), particularly on "open" wireless networks in coffee shops and airports. Make sure your home network isn't "Open", and is using one of the stronger encryption schemes. </div>
<div>
<br /></div>
<div>
Fortunately, most sites and services have moved over to using encrypted connections (HTTPS) for sensitive information, which helps keep things secure, even over less secure networks. That doesn't mean you shouldn't pursue "defense in depth" and ensure your home Wi-Fi is secure! You should get into the habit of considering whether or not an activity is safe in each location (and network) you do it on. Security advice abounds on the Internet, so read a good guide on this. <a href="https://heimdalsecurity.com/blog/home-wireless-network-security/">https://heimdalsecurity.com/blog/home-wireless-network-security/</a> is a reasonably good start for home Wi-Fi. </div>
<div>
<br /></div>
<div>
If you are doing work-from-home over your home internet connection and network, lean on your IT team and politely ask them for any tips or requirements they may have to ensure your work data are secure, and you're meeting legal or regulatory requirements, particularly in heavily regulated industries that have above-normal requirements (finance, legal, healthcare, education are common examples). Proactive businesses will have already issued this information, and they may have already given you training on these topics. They may require you to use a VPN. Make sure you don't do very bandwidth consuming things like streaming Netflix, YouTube or Spotify over the VPN connection (indeed, avoid doing anything "personal" and/or non-work related on a work issued device or account, as a general best practice). </div>
<div>
<br /></div>
<h2>
Things to avoid</h2>
<div>
There are some things you should avoid doing, either in general, or during the present pandemic...</div>
<div>
<br /></div>
<h3>
Wireless range extenders</h3>
<div>
Most people think a "wireless range extender" sounds like a good idea. It certainly sounds attractive. However, the way they work dooms them to poor throughput, and if your wireless connectivity is already marginal they rarely improve things much, if at all. My professional advice is to avoid these like COVID-19.</div>
<div>
<br /></div>
<h3>
Going out to shops to buy more gear</h3>
<div>
It seems "essential" to fix broken wireless - I totally get that. But it is <b>not</b> worth the combined social risk to physically go out and buy new tech toys right now, even if you are allowed to. If you need to make things better, focus on doing so with what you have to hand, or changing behaviour to make optimal use of what is available. If you believe it is appropriate to make use of services that deliver things, it may be worth doing that, in a pinch - but do spare a thought for the risks you're subjecting those in the supply chain to (and consider if you need to sterilise the packaging)!</div>
<div>
<br /></div>
<h3>
Breaking lease conditions</h3>
<div>
If your lease doesn't allow you to nail things in or drill holes, don't do it! Stick to methods that are within what your lease allows. </div>
<div>
<br /></div>
<h3>
Blaming neighbours for the shortcomings of Wi-Fi standards and the use conditions of ISM bands</h3>
<div>
It's probably tempting to blame your neighbours for bad Wi-Fi if there are other networks around making things hard for you. I can assure you that you can have bad Wi-Fi in a farmhouse many miles / kilometers away from other people. There is no law or regulation that says your neighbours have to cooperate with you, or not interfere with your Wi-Fi - indeed, the regulations typically say that as long as all the equipment is operating correctly and within the required <a href="https://en.wikipedia.org/wiki/Effective_radiated_power">EIRP</a> values, you all have to deal with it! Seek, but do not demand, cooperation, if possible. </div>
<div>
<br /></div>
<div>
The only way to deal with really crowded spectrum is to get closer to access points, and try to move things into other frequency bands that other people aren't using - narrow 5.8GHz frequency bands give you a lot of scope for interference-free coverage, but most home routers default to really wide channel widths for "performance". If you don't need 300 megabits per second between your device and the access point / router, then don't waste spectrum on being able to achieve that! There is a delicate balancing act between channel allocations and performance, to the point Wi-Fi is an increasingly important specialisation in IT. </div>
<div>
<br /></div>
<div>
Be the good person in your neighbourhood in these matters and seek to optimise for all. </div>
<div>
<br /></div>
<h2>
Conclusion</h2>
<div>
Thanks for reading this guide - I hope you get some useful tips from it and manage to "un-suck" your home Wi-Fi. I'll try to monitor the comments in case things are unclear or you need some help, but there is a limit to how much time I can spend attending to individual questions! :) </div>
<div>
<br /></div>
<div>
Stay safe out there, help others responsibly, and do whatever you can to respect and help support social distancing, isolation and quarantine efforts. </div>
<div>
<br /></div>
<h2>
Further reading</h2>
<div>
<ul style="text-align: left;">
<li><a href="https://www.metageek.com/work-from-home-wifi/">Metageek's Work from Home Wi-Fi guide</a></li>
<li><a href="https://www.metageek.com/training/resources/design-dual-band-wifi.html">Metageek's dual band design guide</a></li>
<li>Goodness, this is interesting! I want a professional level understanding of Wireless. </li>
<ul>
<li>Read the <a href="https://www.wiley.com/en-gb/CWNA+Certified+Wireless+Network+Administrator+Study+Guide:+Exam+CWNA+107,+5th+Edition-p-9781119477501">CWNA study guide</a> - you can also find it on Amazon in kindle or hardcopy. I'll warn you now, it's tough going, and memorably rates in parts as the hardest slog of any technical book I've ever read (and I read a lot of technical books, and usually enjoy it). </li>
</ul>
<li>I want to see how some of this stuff affects networks in the real world</li>
<ul>
<li>Ars has an interesting article that will make you learn a lot of useful Wi-Fi pain points very quickly - if you understand the bits of the article before they get to the programs they use to test, you'll have upped your Wi-Fi knowledge completely! It will certainly help you pick your next router, and help you to understand why I've made some of the suggestions above. <a href="https://arstechnica.com/gadgets/2020/01/how-ars-tests-wi-fi-gear-and-you-can-too/">https://arstechnica.com/gadgets/2020/01/how-ars-tests-wi-fi-gear-and-you-can-too/</a></li>
</ul>
</ul>
</div>
<div>
<br /></div>
<div>
<i>Footnote: </i></div>
<div>
<span style="font-size: xx-small;"><br /></span></div>
<div>
<span style="font-size: xx-small;">Some of you may recognise (many) of the section headings. Yes, I've indeed amused myself by applying some of the <a href="https://www.axelos.com/welcome-to-itil-4">ITIL v4</a> guiding principles as headings, with apologies to Axelos. As hinted <a href="https://schoolsysadmin.blogspot.com/2020/02/funemployment.html">in a previous post</a>, I'm doing some certifications, and passed the Foundation exam over the weekend, so these phrases are bouncing around my head at the moment. </span></div>
<span><!--more--></span>
<div>
Posted by James Stapley (http://www.blogger.com/profile/10040742550730807408)</div>
<h1>
Funemployment</h1>
<div>
<i>Published 2020-02-27T22:29:00.009+00:00; last updated 2020-08-28T11:39:24.332+01:00.</i></div>
My wife and I recently moved continents, and I have now joined the ranks of the funemployed, because (for our family) moving was more important than moving <i>to a job</i>. I'm certainly going to miss my old job (because, aside from being given billions of dollars and told to go explore the oceans, "Network Architect" is exactly where I want to be, and I worked with great people and did fun things!).<br />
<br />
Read on for some thoughts on how to make the most of this process, and perhaps some ideas of what to do when you move vast distances...<br />
<br />
<a name='more'></a><h2>
Stranger in a Strange Land</h2>
<br />
I (more or less) grew up in the country and city we've just moved to, but after spending 20-something years in another (and not visiting this one in about 15), it's not yet "really" home - although the culture isn't particularly alien, and the words and accents are familiar. Of course, as a <a href="https://en.wikipedia.org/wiki/Third_culture_kid">third culture kid</a>, I assimilate accents quite fast, and so mine has a certain South African twang to it now. I imagine it will lessen as I spend time outside the house. The biggest change I've noticed is the soundscape of the city has changed - with the move towards more efficient buses and taxis, the old thrum of diesel has left, and something about the city feels different as a result (not to mention the changes to the skyline). I forgot how screamingly loud the underground was, or perhaps it's because I'm using different lines than I used to (or perhaps the lines and rolling-stock have aged and gotten worse...). On the plus side, Internet is fast (sadly DOCSIS rather than fibre in this area) and cellular communications are cheap (as are many tech toys), the power mostly stays on, water reliably comes out of the taps, and gas is conveniently piped to your home - and, despite the circus show of politics, things mostly just<i> work</i>. I don't think I've quite caught up with the change yet - I think my body thinks it is on some kind of holiday!<br />
<br />
<h2>
On using the time</h2>
Most ICT professionals bemoan that they'd like to spend more time learning new things, but simply don't have it. Well, when you join the ranks of the funemployed, time is very much a thing you now have - indeed, it's your main resource! Depending on your financial resources, you've got a few options to make use of it, including hitting the books to get another cert (or three, or more...); enrolling in a course if self-directed learning isn't your thing, or isn't possible through books or research; building and playing in a home lab (virtual or physical); or even enrolling in a more full-time and long term programme of studies (like a degree).<br />
<br />
Before you do any of that, it's probably worth doing two things - one, thinking about what you would <i>like</i> to learn, and two, comparing that with what the job market seems to want in roles you are interested in applying for. Let those factors guide you in selecting a sensible route forward. It's also perhaps a useful period of time to take stock and build up a plan of where you would like to be and what you want to be doing in <i>n</i> years' time. I'm incredibly fortunate that my wife is gainfully employed already, and that we have enough saved up to weather the financial storm, but you certainly want to maximise your use of the time - and minimise the out of work time, too - unless you have a really good reason to stay out of work a little longer (say a full time course).<br />
<br />
Make sure you take careful stock of your finances, too - you may want to get a particular qualification, but it may simply be unaffordable. Perhaps, if the permanent job market is dire, you could intersperse project consultancy work with learning, or go down a few grades in the tech work you usually do. Be wary of loans, credit cards, and other forms of debt! Hopefully, you have planned the move and have some savings; you may need to compromise on things like housing (i.e. downsize or live in a less salubrious neighbourhood) to be able to afford rent in a new city when only one family member is employed. You'll figure it out!<br />
<br />
Also think of what the industry you're in will look like in <i>n</i> years' time. If you're in systems or network administration, and you're not honing your skills in scripting, some light programming in at least one major language and various automation platforms (think <a href="https://nrelabs.io/2019/10/network-automation-certifications/">devops or devnetops</a>) you're going to become irrelevant outside of very small organisations. Put some cloud learning in there too, if you haven't already - and spend some time thinking carefully about infosec. If you manually do repetitive things, figure out how to stop doing that real soon now. Reflect on experiences from where you've worked before - what steps should a customer support person be able to do rather than calling on a sysadmin or network engineer? How could you implement that effectively, and safely (in terms of both "blast radius" if it goes wrong, and in terms of information security)? How could you have automated or "infrastructure as code"-d your environment? How can you improve your own efficiency or effectiveness? Better prepare for disaster recovery? Harden your environment against attack? Whilst there are guidelines on how much time you should spend automating tasks (e.g. <a href="https://xkcd.com/1205/">https://xkcd.com/1205/</a>), you should recognise that when you first start doing this you are gaining more than just time-saving - you are gaining experience and competence (perhaps ultimately even expertise or hopefully mastery) in automating things, so don't discount working on mini-projects that might be "educational" more than useful.<br />
<br />
Do a lot of critical self-reflection. Do you like large or small organisations? Are there industries that particularly appeal? Are you more of a specialist or more of a generalist? What parts of jobs do you love (and hate)? What are you good (and bad) at? What are your strengths and weaknesses?<br />
Why?<br />
<br />
Obviously, a home lab to play with learning or honing skills is ideal, but you may be able to find other resources to plug some of the gaps, and your home lab setup might not be a dream setup - remember that if you can learn in it, it's good enough. Whilst learning within the specific environment you want to be working in is ideal, you may not be able to afford the gear. See if there are virtualised platforms you could use instead, or recognise that sometimes learning the underlying concept is more important than learning the exact sequence of magic words to type in to achieve it on a specific platform - if you have a rock-solid understanding of how e.g. BGP or OSPF is supposed to work, it's easy to apply that to any platform. Leverage FOSS! MikroTik make very inexpensive routers if you want to learn more about routing, but you may also find network simulators like <a href="https://www.eve-ng.net/">EVE-NG</a> or <a href="https://www.gns3.com/">GNS3</a> useful (if you can run them on hardware you happen to have). Raspberry Pis make reasonable basic Linux servers. Look out for vendor training platforms; Juniper, for instance, offer a lot of free learning resources and a free virtualised lab environment (<a href="https://jlabs.juniper.net/vlabs/">vlabs</a>), and support the <a href="https://nrelabs.io/">NRE Labs</a> site. <a href="https://www.virtualbox.org/">VirtualBox</a> will let you run VMs on your Windows machine very easily. I've written two posts on learning - <a href="https://schoolsysadmin.blogspot.com/2020/07/dunning-kruger-and-learning.html"><i>Dunning-Kruger and Learning</i></a> and <i><a href="https://schoolsysadmin.blogspot.com/2020/07/read-it-note-it-redo-it-teach-it-how-to.html">Read it, note it, (re)do it, teach IT: How to learn effectively</a></i>.<br />
<br />
Once you've figured out what you want to do and how to do it, make sure you draw up a timetable or other framework to ensure you stay on track and on focus; as with remote workers, it helps to do the personal hygiene stuff, put on pants every day and have (if possible) a dedicated "work" area. If you struggle with internal motivation or are easily distracted, figure out some way of ensuring accountability, perhaps including "rewards" for good behaviour!<br />
<br />
As finding a job in a new market can take some time, make sure you block out sufficient time to actively job-hunt and otherwise work towards getting employed, unless you've agreed with your family that you're taking X months out to focus on learning new skills or gaining new qualifications - even then, it's worth keeping an eye open for any particularly interesting work opportunities.<br />
<br />
Remember, as you gain new qualifications, skills and experience in your "productive downtime", that you'll need to add them to your CV and LinkedIn profile, mention them in covering letters, and even use them in your interviews. <br />
<br />
If you're part of a family, make sure you do enough of the household chore heavy lifting - but don't make it the only thing you do in a day, and try to ensure your "quiet time" to get job hunting or learning done can be respected. Make sure you regularly talk things through with your partner, as well (note to self...). Periods out of work (on top of moving, one of the top 10 most stressful things you can do, apparently) can cause considerable stress, and even frayed tempers. Don't let poor communication make that worse!<div><br /></div><div>You may want to (indeed I strongly recommend that you do) set aside some time to put together some sort of career plan, even in a vague outline - where do you want to go, and what will it take to get there? If you need some inspiration, there is a series of articles on this exact topic on Wendell Odom's blog; have a look at the career planning posts from 2015 at: <a href="https://blog.certskills.com/category/general/career/">https://blog.certskills.com/category/general/career/</a><br />
<br />
Bottom line - make sure you plan things, speak to your people, and don't waft around aimlessly!<br />
<br />
<br />
<h2>
On the job hunt</h2>
Another thing you ought to do (particularly if you feel you're already well stocked up on certs, skills and experience) is apply for work.<br />
<br />
The best routes to do this are obviously industry and market-specific, but the usual advice to polish up your CV, LinkedIn profile, and cover letters is a good start (as is taking the time to tailor them for each specific position). You may also be in a market or industry where recruiters play a big role in finding people. It may be worth setting up an appointment with a few that work in the industries you like and with the skills you bring to the table - so they get a feel for you and your strengths, and get a feel for what you might offer to their customers. Some will give you assistance with applications, like polishing your CV or cover letter or doing some interview preparation work (as it's in their interest to get candidates they put forward employed). Treat such meetings as important job interviews - the good recruiter is definitely trying to suss you out as a possible hire for their clients. Such relationships may also bear fruit over several years.<br />
<br />
If you move into very different job markets, you will also probably have to take a pay cut, a seniority cut, or both; if, like me, you've moved from a small town in a developing world country where technical skills are rare and valuable, you may have just entered the "big leagues", and have <i>much</i> more competition for roles. If you do have to make that trade-off, make sure you have a plan to get your career back "on track" as soon as possible. There's probably some careful balancing act to be undertaken between accepting "too junior" a role and being out of work "too long" by reaching too high. A strategy there might be to start where you think you should be, see if anyone bites in a reasonable time-frame, and if not, start moving down the hierarchy/experience/skills ladder. It is, of course, sometimes just a "waiting game" - but if you're not getting invited to interview, then certainly that should be an indication that your CV needs work (not necessarily only in terms of format, but in terms of applicable content - which might mean you're not experienced enough or lack something they're looking for).<br />
<br />
Be certain to have some lion-taming stories prepared for your interviews. By that, I mean good answers to questions like "what's the most complicated X you've ever fixed / installed / architected / diagnosed / broken"! Reflect on key requirements or skills listed by each position and have stories about how you've tackled similar (or at least vaguely related) challenges and technologies. People like an engaging story - obviously, you don't want to lie or embellish the truth, but make sure you have a logical and concise structure to your story as well (like <a href="https://www.themuse.com/advice/star-interview-method">STAR</a>). Interviewers are generally trying to answer two over-riding questions: 1) can they do this job? (mostly answered by the CV already) 2) will they fit in and be good colleagues? It really helps to have thought about potential questions BEFORE the pressure of the interview is on, because there is always a risk on the day you'll go blank and look like an idiot. Interviews are widely acknowledged to be pretty terrible ways of assessing people, but you might as well "hack" them to perform as well as you can in them, so you can go on to do the same in the actual role! Don't be afraid to pause for a moment to think about how best to answer a question - it's better to structure your thoughts and have a focused, concise and good answer rather than just rambling at the interview panel, terrified of silence... Spend a bit of time learning more about good CVs and interview techniques if you haven't done so. If you can avoid telephonic or "skype" interviews, do so. I've had some horrible experiences with them, like the time the other end's echo cancelling system just screamed in my ear every time I tried to talk - incredibly off-putting and I did not perform terribly well as a result. Body language and in person conversations are also just much better. 
Treat each interview as a learning exercise, too - spend some time after each one reviewing what you felt was good and bad about your own performance, and how you could improve next time. If you're like me, you'll spend hours kicking yourself about all the ways you could have answered the questions better - so put that work in ahead of time (note to self?). I've written <a href="https://schoolsysadmin.blogspot.com/2020/08/interview-job-application-preparation.html">a longer post specifically on applications and interviews</a>. <br />
<br />
Also, check if you need some new clothes for interviews; depending on the industry and market, you may need a range from very formal to "Silicon Valley Smart" (you know, your good hoodie, clean jeans and shiny sneakers!). Whilst many of us hardly take a second glance at clothing and shoes, this is not the case of all people, and particularly, non-tech people on panels. I was devastated when I had to stop wearing baggy shorts and t-shirts and move to chinos and a collared shirt, but now I'm used to it. I also don't hate a suit, but I prefer less formal (I've also spent a lot of time crawling around in ceilings, where suits don't make a lot of sense). Don't let uncomfortable shoes or clothes put you off your game. Bottom line here - look the part you're supposed to play! Don't be afraid to call for advice here if style/fashion is something way outside your comfort zone. If you can find workplace pictures that aren't stock photos, they may clue you in on expected dress; if you've moved country or even city (or industry) you may find dress norms different from what you've done before; it's usually appropriate to err marginally on the side of too formal, unless that flies grossly against the workplace culture.<br />
<br />
Be at least a little flexible on things like which industry or position you'd like to work in - by all means, start by choosing ones you prefer, but if that doesn't work out, try another; try a lower job grade position if you need to. Once you get employed in the country/city, it should be easier to get employed somewhere else - and a job will pay the bills and help you gain further experience until you can find something better - if for some reason that initial position doesn't fill you with joy! Likewise, whilst we'd all prefer "permanent" employment, some part-time or consultancy work isn't necessarily a bad move.</div><div>
<br />
<h2>
Networking is not just for packets and frames...</h2>
It's an open secret that typically the easiest way to get hired is if someone on the hiring committee already knows you and your work. Thinking back to all the jobs I've been offered in the past, I have no doubt whatsoever that people knowing me has helped me - indeed, I was more or less "head-hunted" for the last two positions I filled, and the two before that, people wanted me to work for them, too. I now completely lack that "social capital" - I'm sure it's slightly less powerful in a megacity compared with in a relatively small town, but I imagine it still has its place.<br />
<br />
So how can you develop social capital, or a close analogue?<br />
There are three fairly obvious avenues:<br />
a) online presence (not helpful in all industries) - establish yourself as a "thought leader" or dazzle people with your brilliance in appropriate venues (code in github, blogs, LinkedIn articles, videos, articles in trade magazines or whatever works in your sector and area and plays to your skills). This can help get you known.<br />
b) meet and impress industry-specific recruiters.<br />
c) network with peers.<br />
<br />
Unless you directly work in online marketing or are looking to get into technical article writing, the second two are likely to be more effective. In particular when you move to larger cities, you will find that there are various fora you can attend to meet other people with similar (professional) interests to your own. With my interest and experience (and desire to continue working) in networking, meetups like <a href="https://netldn.uk/">NetLDN</a> or <a href="https://www.netmcr.uk/">NetMcr</a> make eminent sense. <a href="https://www.emfcamp.org/">EMFCamp</a> looks interesting! I'm an odd mix of sociable but shy - I've learnt I enjoy being around other people, but I am hesitant to step out of my comfort zone and actively meet and interact with new people. If you're anything like me, you need to seize that particular bull by its horns, and get out there! If (or when) you meet people that you think are doing the sorts of job you'd like to be doing, it may be possible to ask them their thoughts on what it takes to get there. Even just "hanging out and listening" can lead to some really useful insights. You might also politely ask if there are any job leads or trusted recruiters they might suggest, or general insights into the job market. There may also be hacker- or makerspaces, or conferences you can attend. Often the most useful parts of conferences are the networking opportunities outside of the scheduled events. You may find reconnecting with friends and extended family, or meeting friends of friends (or family), yields further useful insights or leads. You may even want to spend some of your time volunteering for NGOs or CBOs - many could really do with even just a few hours of skilled systems administration, although a longer term commitment is usually better; as well as the good cause, you'll further hone your skills and may even get some good references or even job leads out of it! 
If you parted on good terms with your last employer, they may even welcome paying you for some of your time. </div><div>
<br />
If you're in the same position as me, good luck out there. There will probably be updates as I navigate these waters myself! :)</div>
<br />
<b>Trust Boundaries and Reliable Backups: Ransomware Edition</b><br />
<i>James Stapley - published 2020-02-01, updated 2020-10-01</i><br />
<br />
A network whose administrators I know quite well has been thoroughly compromised and critical files encrypted, and much configuration destroyed. Even their backups (such as they were) are no more.<br />
<br />
This is, to put it mildly, a fairly catastrophic incident for any organisation.<br />
<br />
We turned our minds to the issue and thought about how we can prevent similar things happening to us...<br />
<br />
<a name='more'></a><h2>
The Compromise :(</h2>
<br />
As far as they can tell, an Internet-facing RDP endpoint was compromised, and then the attackers moved laterally (probably under interactive human control) throughout the network through most vulnerable systems, deleting or modifying configurations, and trashing or encrypting files and data. PowerShell scripts were part of it. In other words, their <i>entire</i> environment and almost <b>all</b> their data was pretty comprehensively pwned - every IT person's worst nightmare. Somewhere, there was a message about sending bitcoins to somewhere, and you'll get your files back...<br />
<br />
One of my acquaintances there warned their management about pretty much every single thing that was compromised and why they needed controls x, practices y, and resources z. These were dismissed. Fortunately for them, this was all in writing. That doesn't help the organisation one bit right now, but you can bet the esteem in which this person's opinions will be held in future just shot up stratospherically within that organisation.<br />
<br />
Over time (<i>weeks!</i>), they've gotten some of their things back through some herculean efforts. A last-ditch effort is to send some hard drives off to a data recovery specialist firm. Some things will, in the end, just be gone.<br />
<br />
So what can we do to prevent something like that happening to us?!<br />
<br />
<h2>
Who do you Trust?</h2>
There were a few areas where "best practice" was not followed (in some cases, perhaps for "operational reasons"; one doesn't wish to pry!).<br />
<br />
An Internet-facing RDP host is generally going to be a bad idea. An unpatched RDP host is a <i>really</i> bad idea. A reasonably secured (and patched...) VPN should probably be put between it and the Internet if there are sound operational reasons why RDP access is required remotely, if possible with even the VPN restricted by a firewall to reasonably plausible source IP addresses. Usually, this sort of thing is not done because a vendor claims it causes problems - or, more likely, because of push-back from non-IT staff who find it "makes their job <i>impossible</i> to have to use the VPN <i>as well</i>". You should want to harden that host (and of course your VPN) as much as possible - there are usually decent guides to be found online to harden the configuration of virtually everything (e.g. <a href="https://blog.malwarebytes.com/security-world/business-security-world/2018/08/protect-rdp-access-ransomware-attacks/">RDP</a>). As painful as many users find MFA, that certainly has a role. But don't do it <a href="https://twitter.com/PREAUX_FISH/status/1221922997482582016">like this</a>. Obviously, baseline best practices such as sane separation of privileges, running under "least privilege" and of course a basic, sane firewall rule-set - and patching systems - are going nowhere, and need to be a key part of what good IT teams do all the time. Most hosts in their default state are not great - but most vendors (or helpful blog authors) offer better, more secure suggested configurations for most systems. For all configurations that are "necessarily risky" make sure the organisation, and their auditors, know about and accept the risk. "Unnecessarily risky" configurations MUST be dealt with.<br />
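<br />
To make "restricted to reasonably plausible source IP addresses" a little more concrete, here is a minimal Python sketch of the allowlist check itself. It is only an illustration: the networks are documentation ranges standing in for real ones, and <code>is_plausible_source</code> is a made-up name. In practice you would enforce this on the firewall in front of the VPN, not in a script.<br />

```python
import ipaddress

# Hypothetical allowlist: documentation ranges used as placeholders for
# "reasonably plausible" source networks (e.g. staff ISPs, branch offices).
ALLOWED_SOURCES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
    ipaddress.ip_network("2001:db8:1234::/48"),
]


def is_plausible_source(address: str) -> bool:
    """Return True if the address falls inside any allowed source network."""
    ip = ipaddress.ip_address(address)
    # Membership tests between mismatched IP versions simply return False.
    return any(ip in network for network in ALLOWED_SOURCES)
```

The same list could be used to sanity-check VPN logs after the fact, for example to spot successful logins from addresses you would never expect.<br />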
<br />
Another issue is that credentials were shared between systems and across (what should be) trust boundaries - identical credentials were shared between administrative accounts in Windows, and a FreeNAS system being used as an iSCSI target for Windows DPM-based backups. It should be obvious that the change of operating system then does virtually nothing to help prevent compromise of the backup files. Once one understands that attackers can dump credentials, abuse session tokens, or can have a great deal of fun once they compromise a domain controller, you're going to want to think about how you can limit the "blast radius" of each potential compromise. If your AD Domain is compromised, it's vital that your backups aren't. This then brings us to Trust Boundaries...<br />
<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhusJaJwBPtQZP0NUwHWgRtyv3u4LVRaLaretc0885kbuuBuN-qHv_GptC4Jt5SqR0DrsBVOSuUIjt7kREXJ3f_8JCvhAuyxUESQV0w76xTE_hItagQFhbk_J68pFQHqoSvqPNxMuCXHoo/s2048/bernard-hermant-OLLtavHHBKg-unsplash.jpg" style="margin-left: auto; margin-right: auto; text-align: center;"><img alt="Fake road warning sign (red circle) with a "trust" sticker over a silhouette of a person." border="0" data-original-height="2048" data-original-width="1903" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhusJaJwBPtQZP0NUwHWgRtyv3u4LVRaLaretc0885kbuuBuN-qHv_GptC4Jt5SqR0DrsBVOSuUIjt7kREXJ3f_8JCvhAuyxUESQV0w76xTE_hItagQFhbk_J68pFQHqoSvqPNxMuCXHoo/w297-h320/bernard-hermant-OLLtavHHBKg-unsplash.jpg" width="297" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Where are YOUR trust boundaries?<br /><i>Photo by @bernardhermant on <a href="https://unsplash.com/photos/OLLtavHHBKg">Unsplash</a></i></td></tr></tbody></table><br /><h2>Trust Boundaries</h2>
<div>
It's pretty convenient to have a single account to do everything. </div>
<div>
<br /></div>
<div>
It's not a good idea. </div>
<div>
<br /></div>
<div>
I spend a non-zero part of my day fetching credentials out of password safes - either my personal ones, or the network engineering one. My workstation account is admin on that machine, but it's not admin anywhere else (plus we don't use Active Directory, so...). I have different credentials to log into my laptop vs my general central user account. Many systems have a separate login to that. Most systems require me to get a root password to do anything "dangerous". There's still a non-zero risk of compromise for a determined enough adversary (dump a clipboard buffer, or some in-memory attack, perhaps) - my personal password safe empties the clipboard after a set interval, and locks itself again. The central one requires interrogation over SSH with a credential each time - of course, then there's the plaintext on screen, so shoulder-surfing is a risk, as are long-running sessions that aren't cleared in terms of local attackers (and hey, the .bash_history file might be quite interesting - but we've disabled that feature, and there are various additional protections there too). We've split one huge password safe into several smaller, role-specific ones. None of them would end in a good day if someone were to pwn them, but it's a (very) carefully considered balance of good passwords and workable usability for a sysadmin/neteng ops team. What risks does Single Sign-On (via SAML, or anything else) bring? Are those self-signed certificates and organisation-wide installation of the root certificate a good idea? </div>
<div>
<br /></div>
<div>
A very useful feature is the ability to rapidly and comprehensively change privileged credentials across an organisation. It's not always easy, but it's a worthy project to work out when people have done silly things ("I'll use my Domain Admin account to allow this random service process to Run As me...") and fix it. This falls into the realm of Identity Management (IdM) and related systems and tooling. What do you do when a sysadmin (or someone with "sysadmin level access", even if their job title isn't sysadmin) leaves your team, regardless of whether it's amicable or not...?</div>
<div>
<br /></div>
<div>
Certainly, best practice suggests that accounts used for day-to-day computing, and those used for privileged tasks, ought not to be the same (and sysadmins running as unprivileged users helps you understand your users' experience better!). The less useful a credential is, the better. In some environments, even using Windows Domain Admin credentials on untrusted hosts (or hosts used for non-sysadmin tasks) may be a bad idea. </div>
<div>
<br /></div>
<div>
If you've not done it, think about your infrastructure and whether a single credential works everywhere to do <i>everything</i> - and if there are areas where certain credentials<i> should not</i> work, and certain boundaries beyond which they MUST NOT work. Having to regularly use different credentials is, in the long run, a lot better than a serious system compromise. </div>
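<br />
One low-tech way to start answering that question is to write down which credential is used where, and look for any that span more than one trust zone (exactly the Windows admin / backup NAS overlap described earlier). The Python sketch below assumes a hand-maintained inventory; the system names, zone names and credential labels are all invented for illustration, and the labels are names of password-safe entries, never the secrets themselves.<br />

```python
from collections import defaultdict

# Hypothetical inventory: (system, trust zone, credential label).
# The labels are names of password-safe entries, never the secrets themselves.
INVENTORY = [
    ("dc01", "windows-domain", "domain-admin"),
    ("fileserver", "windows-domain", "domain-admin"),
    ("backup-nas", "backup", "domain-admin"),
    ("backup-nas-ipmi", "backup", "nas-oob-admin"),
]


def credentials_crossing_zones(inventory):
    """Map each credential used in more than one trust zone to those zones."""
    zones_by_credential = defaultdict(set)
    for _system, zone, credential in inventory:
        zones_by_credential[credential].add(zone)
    return {
        credential: sorted(zones)
        for credential, zones in zones_by_credential.items()
        if len(zones) > 1
    }
```

With the sample data, <code>credentials_crossing_zones(INVENTORY)</code> flags <code>domain-admin</code> as being valid in both the <code>backup</code> and <code>windows-domain</code> zones, which is the thing you would want to fix first.<br />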
<div>
<br /></div>
<div>
Bastion hosts or otherwise specially hardened and configured hosts (like <a href="https://www.microsoft.com/en-us/itshowcase/protecting-high-risk-environments-with-secure-admin-workstations">SAWs</a>) may be the most appropriate place from which to conduct sysadmin tasks (or anything requiring elevated permissions) - and whether or not those should allow for any form of remote access should be very carefully considered. It's typically considerably less of a pain to drive across town at 3AM to fix a glitch than to spend weeks recovering from full compromise - <i>if you ever can</i>. How much do you "trust" hosts (or users, or applications) in various areas of your infrastructure? Are there clear boundaries between them (or some of them) where you could delineate, enforce and protect a <a href="https://en.wikipedia.org/wiki/Trust_boundary">trust boundary</a>? Can you separate the privileges needed to administer desktop systems from those needed for your server infrastructure (i.e. does desktop support require Domain Admin? Probably not!)? This means you will end up with personal username/password combinations for various different roles you may fill as a sysadmin. Figure out how to not let this get in the way too much (and no, the answer certainly isn't "password re-use"!). Can your privileged management networks be accessed from places you would prefer them not to be? Another related idea is that you should only ever go down trust levels - manage from the most trustworthy platform towards the least - and NEVER in the opposite direction (i.e. establishing an administrator-level connection from an untrusted host to a trusted one is a bad idea - like perhaps an RDP session from a client desktop PC back to a Domain Controller server to "quickly change a setting/check something").<br />
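<br />
The "only ever go down trust levels" rule is simple enough to express as a one-line check. The Python sketch below is an illustration only: the platform names and numeric levels are invented, and allowing sessions between platforms of <i>equal</i> trust is my assumption rather than something the rule itself dictates.<br />

```python
# Hypothetical trust levels: a higher number means a more trusted platform.
TRUST_LEVEL = {
    "secure-admin-workstation": 3,
    "domain-controller": 2,
    "staff-desktop": 1,
    "guest-wifi-client": 0,
}


def admin_session_allowed(source: str, destination: str) -> bool:
    """Allow admin sessions only from an equal or more trusted platform."""
    return TRUST_LEVEL[source] >= TRUST_LEVEL[destination]
```

So a session from <code>secure-admin-workstation</code> to <code>domain-controller</code> passes, while the "quick RDP from a staff desktop back to the Domain Controller" case is rejected. A table like this can also be a useful starting point for firewall rules between management networks.<br />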
<br />
A significant challenge is always SMMEs - they don't have the budget for 24/7/365 operations/security team(s) nor large enough teams across time zones that someone is always in the office. You're going to have to balance being the person who drives across town at 3AM vs privileged remote access - or a management approved "sorry, we're not fixing that right now" SLA. Distributed teams with remote workers can also certainly make the challenge greater. Make sure your alerting systems follow your SLA - if you're not expected to fix it at 3AM, make sure it's not waking you up at 3AM - likewise, make sure, if you have on-call rotations, that the notifications don't go to people not on call outside of their working hours. You're going to need that good night's sleep in the morning to clean up the mess! I sometimes wonder if companies ought to provide an allowance to get critical staff to live closer to the office (because rents are usually higher closer to the office in the middle of a city) - or work like many hospitals, with a dorm room for the on call medical staff. From the perspective of keeping things safe, things that are turned off or not connected to a network (airgapped) are low risk from network-based compromise - but don't forget about physical risks like disasters, theft and insider threats from employees or even customers on your premises. Watch out for <a href="https://elie.net/blog/security/concerns-about-usb-security-are-real-48-percent-of-people-do-plug-in-usb-drives-found-in-parking-lots/">freebie removable media</a> - and even <a href="https://shop.hak5.org/products/o-mg-cable">interface cables</a>. You HAVE to spend some time training your employees/colleagues about these risks.<br />
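<br />
"Make sure your alerting follows your SLA" can be sketched as a small routing decision. This is a hedged Python example, not a description of any particular monitoring product: the working hours, the single "critical" severity that the SLA covers out of hours, and the function name <code>should_page</code> are all assumptions.<br />

```python
from datetime import time

# Hypothetical SLA: only "critical" alerts are fixed out of hours.
WORK_START = time(8, 0)
WORK_END = time(17, 0)


def should_page(severity: str, now: time, recipient_on_call: bool) -> bool:
    """Decide whether an alert should interrupt this recipient right now."""
    in_hours = WORK_START <= now < WORK_END
    if in_hours:
        return True  # During working hours everyone can be notified.
    # Out of hours: only on-call staff, and only for what the SLA covers.
    return recipient_on_call and severity == "critical"
```

Anything that returns <code>False</code> here should still be recorded somewhere it will be seen in the morning; the point is only that it should not wake anyone up.<br />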
<br />
Sometimes, <a href="https://www.wired.com/story/notpetya-cyberattack-ukraine-russia-code-crashed-the-world/">accidental trust boundaries save your bacon</a> (flaky African power saves a global supergiant). This might also be a random snapshot that was left somewhere, or a copy of a server running on some dev's laptop - that's extremely lucky, not a strategy. Make sure you make<i> intentional</i> trust boundaries in the right places, and carefully consider worst case scenarios and what the "unthinkable" might be!</div>
<div>
<br /></div>
<div>
The best trust boundary, of course, is an airgap - (tested) WORM media offline backups, perhaps on a different physical storage medium type (hard drive vs tape vs optical, etc), probably in a different location, are the "gold standard" for backup and disaster recovery for a reason. Remember not all offsite locations are created equal - in a SMME, the IT manager or CEO's house might be an OK offsite location, but in a highly regulated industry... not so much. In some limited cases, paper may be the most reliable form, even if it's annoying to re-digitize and has its own risks. (We're good at storing paper). In some industries, you may have to consider (very) patient APTs or threat actors in your processes. </div>
<div>
<br /></div>
<h2>
Towards "Zero Trust"? </h2>
There is absolutely no denying that traditional models of Enterprise architecture and security have changed (perhaps beyond recognition) - a hardened perimeter and then very tightly managed hosts inside the organisation have completely vanished under the onslaught of a more mobile workforce with frequent remote working, BYOD, a wider range of devices and operating systems - and of course online B2B, B2C and Cloud infrastructure and processes. People want (and arguably need) access to things 24/7/365 wherever they are on whatever random device they choose to use - how do we adjust our practices to this new paradigm? Can we reasonably limit certain behaviours or practices?<br />
<br />
This then suggests a move towards "<a href="https://www.cloudflare.com/learning/security/glossary/what-is-zero-trust/">zero trust</a>" - don't trust <i>anything</i>. Configure <b>everything</b> as if it were on a publicly routable IP address directly on the Internet. And then layer your protections with "defense in depth" best practices and norms on top of that. Make sure you can <a href="http://www.opsreportcard.com/section/31">easily disable</a> access for compromised accounts or hosts. Consider how you can harden the human element, which is often the "weakest link". How can you have strong AAA with little inconvenience? What barriers are there to MFA/2FA and what can you do about them? What training do (all) the staff need? Work your way up the OSI model when securing your infrastructure, as the lower layers can irreparably undermine even good efforts in the others. Of course, "<a href="https://en.wikipedia.org/wiki/Layer_8">layer 8</a>" can topple the entire stack. If you're missing skills in your organisation, see if you can hire good outside consultant assistance to help you up your game - and train your people. Try to encourage a learning culture in your team and colleagues. This is normally common in IT, but there are some that don't "naturally" do self-motivated learning.<br />
<br />
I don't know if there is a 1:1 relationship between the sort of people who really OUGHT to be using physical security tokens or other multifactor authentication (MFA), and how likely they are to complain (sometimes loudly) that it "gets in the way" of their work - but there does seem to be some correlation. If people find security "annoying" you can bet that's their systematic attitude to it - their credentials are probably poor quality, and poorly secured, and re-used everywhere, and - probably - on a post-it on their monitor, or written in their PA's "little black book". If you have enough control over the systems in your organisation, you may be able to be increasingly pedantic about what a user needs to do to prove their rights to do an increasingly "risky" activity; low risks may have low (or even no) AAA requirements, whereas very risky things may require MFA, access from specific hosts, or even time-limited access (time-of-day more than session duration, but requiring re-auth is not a bad thing). Logging (to a hardened syslog server or similar) helps you figure out what went wrong, but it prevents nothing; SIEM and other log-based monitoring and alerting can help you detect an attack - and then thwart it, or at least give you hints as to what went wrong, where and what to do to fix it (but, sadly, not totally prevent it). None of that helps if you have <a href="https://www.atlassian.com/incident-management/on-call/alert-fatigue">alert fatigue</a> in your team and stop paying attention or ignore it (I've personally experienced this with our own monitoring - in other words, I get too many alerts - my colleagues, who often get storms of 300+ low priority [and, to some extent, irrelevant] alerts at a time, must really not notice message alert tones).<br />
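<br />
The idea of asking for more proof as the activity gets riskier can be sketched as a lookup table plus a check. The Python below is a minimal illustration under my own assumptions: three tiers named low/medium/high, and three kinds of proof (password, MFA, trusted host). Real systems will have their own tiers and their own policy engines.<br />

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    password: bool
    mfa: bool
    trusted_host_only: bool


# Hypothetical tiers; the names and the mapping are illustrative only.
REQUIREMENTS = {
    "low": Requirement(password=False, mfa=False, trusted_host_only=False),
    "medium": Requirement(password=True, mfa=False, trusted_host_only=False),
    "high": Requirement(password=True, mfa=True, trusted_host_only=True),
}


def is_permitted(risk: str, has_password: bool, has_mfa: bool,
                 on_trusted_host: bool) -> bool:
    """Check the presented proofs against what the risk tier demands."""
    need = REQUIREMENTS[risk]
    return (
        (has_password or not need.password)
        and (has_mfa or not need.mfa)
        and (on_trusted_host or not need.trusted_host_only)
    )
```

A time-of-day restriction, as mentioned above, would be one more field on <code>Requirement</code> and one more clause in the check.<br />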
<br />
In any case, get (and maintain...) the "basic" hardening, patching, user training / security culture, strong AAA and upgrading obsolete stuff right <i>long</i> before you go down the big league infosec rabbit-hole - so many organisations get that wrong, so you'll be ahead of the pack already - being a hard target tends to help mitigate "script kiddie" drive-bys and low-level intentional attackers (and their bot armies). If you're worried about nation-state actors, well, it's time to look at building out a full, dedicated 24/7/365 infosec team. It's not paranoia if they're out to get you. And "they" are...!<br />
<br />
<h2>
Disaster Recovery Role-Playing</h2>
It may be worth setting aside some time within your team to role-play various disasters, following any disaster recovery plans you already have, and perhaps improving those as you discover gaps. If you don't <i>have</i> a DR plan, write one, and start testing it. Red and Blue teams have effectively made this an "all day, every day" practice, often on live systems, but a "<a href="https://en.wikipedia.org/wiki/Tabletop_role-playing_game">desktop RPG</a>" version is also very useful. (Someone <a href="http://cryptorpg.com/">has written an actual infosec RPG</a>). Make sure you consider how you'll rebuild from scorched earth, and how you can ensure that your recreated environment is "known good" and, to the extent possible, is free of any lingering traces of the attack or catastrophe. What happens when you plug your air-gapped backup into <i>that host there</i> to start recovery....???? Make sure the senior management of the business understands the likely ETR for various disasters, has some understanding of the likelihood of that disaster happening, and that they're ready for the worst to happen and will support the process of clawing everything back, and have already accepted the time it will take and the business impact. They also need to understand if they don't like the ETR, the only solution is (almost always) more resources - often on an exponential curve. "Sure, you can have "nine nines", but it will take 99% of last year's operating profits to reach it" is not a popular answer (particularly when you later hit the outage after spending the money). Iterate through the low-hanging fruit first, before tackling the harder / expensive problems. "Badly crimped cable" is the worst outage reason ever in a world of certified, low cost, moulded cables, relatively cheap access switches and multi-nic machines - look for fixes and preventative steps like that during your team exercises.<br />
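<br />
To put numbers on the "nines" conversation with management, the arithmetic is short enough to do in a few lines of Python. This sketch assumes a 365-day year and treats "N nines" as an availability of 1 - 10<sup>-N</sup> (so three nines is 99.9%); the function name is made up.<br />

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget for an availability of `nines` nines."""
    unavailability = 10 ** -nines  # e.g. 3 nines = 99.9% -> 0.001
    return MINUTES_PER_YEAR * unavailability


for n in (2, 3, 4, 5):
    print(f"{n} nines: {allowed_downtime_minutes(n):.1f} minutes/year")
```

Three nines allows roughly 525.6 minutes (a little under nine hours) of downtime a year; "nine nines" works out to about 32 milliseconds a year, which is why the price tag climbs so steeply.<br />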
<br />
Your organisation gets pwned. How do you recover all the desktop machines? Did people actually follow your instructions to store data in the right place, or did they leave it all on their C drive and now it's gone? <br />
Was there un-encrypted PII on that laptop that was lost - what's the fine going to be, and what do you need to do to report it to the data protection authorities?<br />
All your sysadmin team gets on a single bus that drives off a cliff in dense fog on their way to an off-site DR planning meeting. Now what?<br />
<br />
Overall situational awareness is important, and a "chain of command" can be useful. Make sure you model partial awareness in the players and see what chaos erupts (only the "Dungeon Master" should "see the whole board"!). If your organisation is strongly "silo-d", model that. Then play a round with no silos. Have post-mortem discussions about the results with everyone, and leave them space to have their own "ah-ha!" moments about certain types of change, and where the hidden problems might lie.<br />
<br />
One thing you must do during this is consider when actions destroy evidence - and what you should do about that; destroying evidence is legally far from ideal, and it's also operationally very problematic, because once it's gone, it's harder (often impossible) to work out what happened and what was compromised. DFIR is hard.<br />
<br />
<h2>
But I don't have a dedicated security FTE, or a SOC! :( </h2>
That certainly doesn't mean all is lost. Over a certain size, and certainly in regulated industries, management should understand the need for security-focussed staff; if they don't, see the section below. :)<br />
<br />
There are certainly steps you can take - life-long learning is a key part of IT, so up-skill yourself. Work with your colleagues to figure out how to "bake in" security as much as possible in your day-to-day operations, and in projects. Think like an attacker, and if something you do makes your stomach drop when considering "what if that gets pwned?" figure out what the remediation or mitigation is. Mentor and learn from your colleagues! There are people on the Internet that write useful articles about security - read them! Large vendors often have good documentation (why they don't ship their product in the most secure configuration, I'm not sure, but eh, legacy, amIright?). Make sure that you're plugged into relevant security feeds from appropriate vendors, and patch away as appropriate in your environment. When infosec twitter isn't a dumpster fire, it's quite informative.<br />
<br />
Certainly, K-12 schools are unlikely to have anything like the resources to pull off even a tiny fraction of what large enterprises do - but if you cover the basics, and go one or two steps further, you're going to be a<i> lot </i>better off than the average (and often a harder target than a larger enterprise *with* a SOC - because you can often be more in control, and fully understanding and managing fewer things is easier). MFA/2FA can help a LOT where you(r users) are bad at passwords. <a href="https://schoolsysadmin.blogspot.com/2017/06/outgoing-email-security-in-2017-spf.html">Secure your email</a> as much as you can. Train your end users on basic precautions. Use security features, don't turn them off (hello, UAC!). Under GDPR-style legislation, make use of strong encryption (hello, BitLocker) - but make sure you have a way of recovering that data. Iterate - you won't get perfect in one go, and it's a shifting target anyway. Be clear about why spending time on project X instead of project Y is better (which has the greatest result?) - be very wary of more advanced actions meaning you drop the ball on the basics.<br />
<br />
In the same way we expect end users to take basic security precautions (good password practice, not falling for phishing, following process, policy and procedure, realising when they did a stupid thing and what to do about it, etc.) exemplify those behaviours yourself, and extend that ethos and practice to "professional" or "expert" level. My wife absolutely loses her *&%^ every time she has to log into something, or a UAC prompt comes up, or *anything at all* happens that isn't directly something she wants to do - or, worse, "gets in the way". This is, I think, pretty much a good model for a "normal" user; work to change that, and remove friction where it is possible and sensible and safe to do so; and convert them to understanding why those controls are there, what they do and why they are important <i>to that user themselves</i>. "If you want to know who someone really is, see who they are when they use a slow computer".<br />
<br />
Even if it's in your spare time (it's almost fun sometimes. OK, OK, it *is* fun!), think about the overall environment and what the threats are. Figure out the risk (in likelihood*impact format), and see if there are some real priority issues you can bring to the appropriate forum to get addressed. Then go down the list, chipping away. Before long, you're better than a significant majority of other enterprises, and, unless you're really interesting, it's likely people will move on to easier pickings. Certainly, you want to get to the point that determined script kiddies are thwarted at all times, and a determined and persistent advanced adversary is the only thing likely to get in. By the time you're operating at that level, you <i>will</i> need a SOC to respond to alerts and mitigate threats in near-real-time. Users tend to be amongst the biggest "wild cards" in securing IT. Privileged users, even more so! If a new threat is hitting mainstream IT media, you know it's something you need to assess for likely impact, and patch/mitigate if it's applicable. In larger environments, compile a "risk register" - obviously, this is privileged information and should not be widely accessible - and address them - and get management to formally accept things that cannot be changed or cannot be totally prevented. In some cases, IT's role is to identify problems. Management assigns priority or accepts things as they are. It's not acceptable, I think, for you to <i>not</i> flag problems you identify through the appropriate channel. Sadly, it is sometimes appropriate for management to say "yes, the CEO can have his cat's name as his password, and yes, the cat's name is in his corporate profile on the website because CAT IS LIFE". You might want to opt out of such an organisation... C. Y. A, squared! (cover your ass by raising your professional opinion on the matter, then "see ya" - get another job).<br />
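<br />
The likelihood*impact ranking mentioned above needs nothing fancier than a sorted list. Here is a minimal Python sketch; the 1-5 scales, the class name and the sample entries are all invented for illustration, and a spreadsheet does the same job perfectly well.<br />

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain); scale is an assumption
    impact: int      # 1 (minor) to 5 (catastrophic); scale is an assumption

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def prioritise(risks):
    """Return risks ordered from highest to lowest likelihood*impact score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Made-up entries purely for illustration.
register = [
    Risk("Unlabelled patch cable in comms room", likelihood=2, impact=2),
    Risk("Shared admin credential on backup NAS", likelihood=3, impact=5),
    Risk("Internet-facing RDP host", likelihood=4, impact=5),
]
```

Calling <code>prioritise(register)</code> puts the Internet-facing RDP host first, the shared backup credential second and the patch cable last, which gives you the order in which to chip away at the list (or to ask management to formally accept the risk).<br />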
<br />
Later versions of PowerShell (5.x) have a lot more controls. PowerShell is useful for managing systems. That also means it's useful for <i>attacking</i> systems. A lot of IT is like this - there are better configurations; there are logging and security controls built in; sysadmin tools can be turned around offensively ("<a href="https://medium.com/threat-intel/what-is-living-off-the-land-ca0c2e932931">living off the land</a>"). Leverage these as much as you can.<br />
<br />
If you don't have money to spend on problems, there are cheaper (often free) ways to achieve quite a lot. If, for instance, you don't have money for a heavyweight SIEM, you can accomplish a lot with Windows event logging. Indeed, people have built up a whole system around it - <a href="https://docs.microsoft.com/en-us/archive/blogs/jepayne/weffles">WEFFLES</a> (<a href="https://github.com/jepayneMSFT/WEFFLES">GitHub</a>). Even fairly basic enterprise firewalls may have features you can use (vulnerability scans, reporting) that help you get a better picture of your environment. I once caught malware simply because one of the client ports looked weirdly busy when I started graphing switch ports with Cacti - a few moments of investigation showed this to be malware traffic (it's long enough ago I can't recall what it was, and this was before I'd learned to take notes...). Leverage tools that are included in your licensing - WSUS, WDS and MDT, along with Group Policy, are really strong ways of getting most of the way to consistent and secure machine configuration in Microsoft environments once you take the time to configure them correctly. You may find some use in an MDM solution for BYOD, too - I found a lot of very useful knobs in the one Google allows you to set up in G Suite. The less you control BYOD, the less you should trust it (so partition your network(s) appropriately!).<br />
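<br />
The "weirdly busy port" trick does not need a graph to work. Here is a minimal sketch that flags ports carrying far more traffic than the median port; the counter values, the port names and the factor of 10 are assumptions for illustration, and real numbers would come from whatever is already polling your switches.<br />

```python
from statistics import median


def busy_ports(octets_by_port, factor=10):
    """Return the ports whose counter exceeds `factor` times the median port.

    `octets_by_port` maps a port name to bytes seen in one polling interval.
    The mapping must not be empty (median() raises on empty input).
    """
    baseline = median(octets_by_port.values())
    return sorted(
        port for port, octets in octets_by_port.items()
        if octets > factor * baseline
    )


# Hypothetical sample of per-port byte counts for one interval.
sample = {
    "ge-0/0/1": 1200,
    "ge-0/0/2": 900,
    "ge-0/0/3": 1500,
    "ge-0/0/4": 250000,
    "ge-0/0/5": 1100,
}

print(busy_ports(sample))
```

A flagged port is only an invitation to go and look; a backup job or a large download will trip the same check.<br />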
<br />
Remember that you can spend a<i> lot</i> of time and effort setting up control and monitoring - if nobody checks those things, it's wasted effort (and if the alerting is poorly tuned, alert fatigue will erase their benefit). With no FTE or job profile KPA dedicated percentage on security, you have two basic choices - give up, beyond the absolute basics, OR ensure that your manager understands that "systems administration: 50%" implicitly means "Security: 10%" - or whatever is reasonable. A third path, of course, is to fight for appropriate necessary change.<br />
<br />
If nobody in your team has the skills or experience to systematically think through these issues, IT Auditors will usually cover the basics, and pen-testing firms will certainly find you some things to work out. Many businesses are more likely to "splash out" on external expertise than hire dedicated FTE or up-skill existing employees. It's sometimes tragic how often such a report will say the same things the sysadmin team (or person!) has said for years - but <i>now</i> management believes it.<br />
<br />
<h2>
Working <i>with</i> management</h2>
In some organisations, there is a fairly good working relationship between IT and Management - management trusts IT to suggest and deliver business-appropriate IT policy and procedure (and of course systems and services), and IT expects management to back that up through enforcement, political support of "unpopular" decisions, appropriate staffing, training resources and budgeting.<br />
<br />
Not everywhere works like this!<br />
<br />
I have, for instance, seen a relationship where a sysadmin keeps asking for hard drives (storage is <i>kind of important</i>) in order to run systems, decent backups, and the like, and their manager never approves this. Imagine if this organisation was hit by the scenario above. What chance would the sysadmin have to maintain multiple independent copies of the data - let alone test recovery of a backup set - if they scarcely had enough space for day-to-day operations...? I've never worked in an organisation where this was a problem - a well-reasoned request, explaining why it is needed, to be charged against an approved budget is not usually turned down - even by non-technical managers. But it happens!<br />
<br />
On top of this, in larger organisations, IT departments have management that sysadmins and the like have to work through (or, ideally, <i>with</i>). Eventually, you will hit the limits of what they understand about particular technologies or problems. The trick is, I think, to learn to manage-your-manager or "delegate upwards" - it's possible you know more or have thought more about particular types of problems, or know the low-level configuration details that are a potential problem (but not necessarily all business needs or about some trade secret or nascent deal). Present your ideas in the language of management - risks, rewards, strengths, opportunities, threats; profit, loss and cost - (unless you have a very technical manager who already "gets IT", of course). Don't (ever) overplay your hand - and try to be specific to your business. Be "useful" to your manager or the HOD - consider presenting them with good ideas in ways they can easily present to C-level peers in the relevant management fora, and that is "implementable" (has budgets, rationales, roadmaps, etc.), and has sound business-related outcomes. Present the ideas in the format they like. If they're face-to-face people, discuss it in a one-on-one; if they like text, send them a document - etc. Offer to follow up a basic "pitch" with a more developed proposal. If your ideas are smaller projects with less (political) impact, they may just immediately tell you to implement them (yay). Whilst you are likely not a lawyer (and should be careful with drafting policy in highly regulated industries), be familiar with appropriate legislation, policy and regulation, and show how your suggested changes will help meet those, if applicable.<br />
<br />
I've never worked in "giant" organisations, but I'm certainly seeing that larger organisations tend to have more politics, silos and other "human factors" that sit between sysadmins with a job to do and them getting that job done, or struggle to influence change which they are otherwise perhaps well placed to inform (if not drive). IT people in small organisations are more like the Robin Williams genie in Aladdin - Phenomenal Cosmic Power, <span style="font-size: xx-small;">itty bitty living space</span>. In other words, what you can achieve kind of depends on what IT is like where you work. If you're the only IT person, the great thing is you do everything. And the terrible thing is you do <i>everything</i>. It's hard to work on projects when you're filling the printers with paper or dealing with every single desktop end user issue all day - you immediately need <a href="https://amzn.to/3i1aRFU">this book</a>. But you're likely to wield a lot of power (hopefully with discretion), so you can make big changes and implement big ideas with a lot less oversight or control - be careful how you wield that! Don't be a loose cannon; get approval for changes; learn how to communicate with non-technical decision-makers. In larger organisations, you might be shielded from "Tier 1" problems - but you probably won't have the entire environment in your head, and you may not have access to the entire infrastructure (this can be a surprising change when you move into increasingly larger organisations).<br />
<br />
Even in small organisations, you should run risky or disruptive changes past management for approval. There is always, of course, a risk that you will work with a manager or in an organisation that doesn't heed your advice, or refuses to support you in important changes. In such cases, keep a paper-trail showing that you've suggested (workable) solutions that mitigate defined risks or meet particular needs, and they have turned these down. You do not want to be the scapegoat! In rare cases, depending on the organisational culture, it may be necessary or warranted to "skip a level" and go up the organigram - be damn sure you've exhausted the normal avenues, and that it is acceptable, but in some cases, it's perhaps a move you need to consider - and realise you may burn (possibly career-limiting) inter-personal bridges doing so. In other cases, you may find that managers are swayed more by group consensus than individual "good ideas" - if the entire team says "we MUST do X", that may help. If your job description says you are "responsible for X" and doing X properly is <i>impossible</i> with the resources you have, make sure you cover this in writing, and exhaust all avenues. In some extreme cases, leaving such an organisation (or manager) may be the only recourse for your sanity or to preserve your personal or professional code. It is of course possible you're wrong, or they are not telling you something vital, so do be wary of how far you push things - but if they are flying in the face of standards and norms and it seems reckless or dangerous, well, as the three letter initialism goes: C. Y. A.! In larger organisations, you can achieve quite a lot working "horizontally" - go speak to people in other areas about things you're worried about, or good suggestions they can consider. Also, realise sometimes other people will claim credit; try and move on when that happens to you, as there's not likely much you can do about it. 
"Making your boss look good" is part of the territory in most (all!) jobs, even if it's not explicitly a key performance area or written explicitly in your job profile. And you know you <i>got IT done</i>!<br />
<br />
There is a risk that IT is always seen to "cost money" and "say no" - learn how to turn business needs into a "Yes, of course, and this is how you do that<i> securely</i>". That includes business needs that <i>your business doesn't yet understand it has</i>; as an IT professional, it's your job to identify those! Perhaps it seems ridiculous that you have to understand business rather than business understanding IT - but that is what being professional is about - realise you're there to figure out how to make IT work <i>for the business</i>, not merely just work. The self-evident opposite to that - that all businesses are now IT businesses (and therefore need to "get" IT) - probably hasn't yet filtered through, so you need to work around that!<br />
<br />
It's similar to how I learned - with regards to science communication - that as interesting and vital as I thought something was, it was important to translate it into words, images or feelings(!) that are relevant <i>to the audience</i> ("what's in it for <i>me</i>?"; "why should *I* care?"). Similarly, if I wanted an academic to write an article, they would spend a lot longer modifying a straw man article I sent to them than it would take them to write the same article from scratch - but they would never start from scratch! Give people information the way they want it, written from their point-of-view and highlighting their interests or needs - you're already most of the way there. Give people a path-of-least-resistance to follow that aligns with their mutual self-interest, and they will follow it! (aka "Make doing the right thing the easy thing").<br />
<br />
So, work out how to tailor your internal business communications in ways that are effective, and realise that what works can be surprisingly context-specific. Learning how to do this can drive a great deal of professional satisfaction - doing it well builds another kind of very powerful trust. Of course, if you write too well, you may be turned into "the documentation person" - which isn't necessarily a <i>terrible</i> fate... :)<br />
<br />
Good luck out there; google_moar, work with management - and get patching! And don't forget about those trust boundaries...<br />
<br />
<h1>Happy Eyeballs. Unhappy user.</h1>
<i>Published 2020-01-17.</i><br />
<br />
As part of the migration efforts to IPv6, many programs implement a system known as "<a href="https://en.wikipedia.org/wiki/Happy_Eyeballs">happy eyeballs</a>". The basic premise is that sometimes, IPv6 "is flaky", and after a while, you should give up and take the IPv4 option - resulting in some "happy eyeballs". In a dual stack system, IPv6 is preferred.<br />
<br />
Well the thing about this is it is S L O W... and users (not unreasonably) get grumpy about it. Here's a case where something went wrong, and *everything* was subjected to this delay.<br />
<br />
<a name='more'></a><br />
<br />
Some background first. The network here has run, in one form or another, IPv6 for over a decade. We don't generally find IPv6 to be a problem. These days, most wired LAN segments are dual stacked, and It Just Works(TM).<br />
<br />
Today, I received a call from a perplexed user who said "The internet is slow. Does it have anything to do with the building work?" (there are extensive re-painting and re-roofing efforts underway). Builders tend to kill networks dead rather than slow them down (cue wailing and gnashing of teeth - their scaffolding has been erected smack in front of one of our few point-to-point wireless bridges, and metal beats 60GHz wireless signals, every time). I was of course aware of the <a href="https://twitter.com/RENAlerts/status/1218077249099923456">massive international cable outages</a>, but fortunately, our ISP has routed around that, because they had enough spare capacity to sort it out - and nobody else was complaining about "slowness", and several of the sites and services I use regularly were troublesome for them, but not for me in my office. I checked monitoring, found no outages around that building (and had received no SMS alerts, but I double-check such things) and logged into the switch, found which port they sat behind, and it all looked good (settings all correct, duplex and speed as expected) - having a system that maps users to MAC addresses is pretty useful. Ping to their PC was steady, with no packet loss. All the obvious sources of problems checked, I paid a visit to the PC in question (this is unusual - PC problems are usually left to general IT support techs here, not networks team or sysadmins - silos!) - certainly the problem "felt" odd, so I thought I should check it out.<br />
<br />
First, I simplified the network, removing the passthrough VOIP handset from the equation.<br />
<br />
We then tried to reproduce the error, which was easy. No joy, problem still there.<br />
<br />
Carefully observing for patterns, I noticed some common elements - firstly, the resolution of the DNS name seemed to take an unusually long time in the browser. Then, the first page would take a <i>long</i> time to load. After that, the "loaded" site was perfectly usable (on-site links then loaded fast), but generically browsing across the internet was painfully slow. Once a speedtest site finally loaded, the results were as expected. It was not specific to a particular browser (the user had already checked that before phoning us. Good job!).<br />
<br />
Next, given the seemingly slow DNS resolution, I checked to see nobody had changed DNS to something weird. Nope, normal. nslookup was pretty normal in speed and results were as expected. <br />It was "just" that websites and internet-based services were being painful.<br />
<br />
Perhaps the TCP/IP stack was a bit off. Ran the usual reset commands and rebooted. No improvement.<br />
<br />
I then thought "hmm, there's a pause, then it works again, and keeps working for a site. I wonder if it's a <a href="https://en.wikipedia.org/wiki/Happy_Eyeballs">happy eyeballs</a> thing and IPv6 has gone weird on this machine"? Turned off IPv6. <br />No improvement.<br />
<br />
Clutching at straws, I did another ipconfig /all and was amazed to see an IPv6 address under a "Microsoft 6to4 Adapter" interface I had never seen before.<br />
<br />
When I had disabled IPv6.<br />
<br />
It turns out that somehow (I have no idea how), the hidden Microsoft 6to4 adapter had been installed on this machine. Uninstalled that, and things got back to normal. Turned IPv6 back on again, and it carried on working just fine. For whatever reason, it looks like disabling IPv6 on the interface does not disable the 6to4 virtual network adapter.<br />
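<br />
With hindsight, the address itself gives the game away: 6to4 addresses live in 2002::/16 and embed an IPv4 address. A minimal sketch using Python's standard <i>ipaddress</i> module to sort the addresses you see in ipconfig /all into rough categories follows; the sample addresses are documentation or example values, not the ones from this PC, and zone suffixes such as %12 would need stripping first.<br />

```python
import ipaddress


def classify(addr: str) -> str:
    """Roughly classify an address as seen in interface listings."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return "IPv4"
    if ip.sixtofour is not None:
        # 2002::/16 - the embedded IPv4 address is exposed by the stdlib.
        return f"6to4 (embeds {ip.sixtofour})"
    if ip.teredo is not None:
        # 2001::/32 - Teredo tunnelling.
        return "Teredo"
    if ip.is_link_local:
        return "link-local IPv6"
    return "native IPv6"


# Example addresses only; none are taken from the machine in the story.
for sample in ("2002:c000:204::1", "2001:db8::10", "fe80::1", "192.0.2.4"):
    print(sample, "->", classify(sample))
```

Anything that comes back as 6to4 or Teredo on a network that runs native IPv6 deserves a second look.<br />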
<br />
The problem came down to that rogue 6to4 adapter, and it took me rather longer to figure out than I was happy about.<br />
<br />
Sadly, I didn't take a gander at the routing table (this was a user PC problem to be fixed, not an interesting specimen to study!) - I rather wonder why the PC was preferring that to the native IPv6 addressing.<br />
<br />
All of the "<a href="https://www.ripe.net/publications/ipv6-info-centre/deployment-planning/transition-mechanisms">transitional</a>" IPv4/IPv6 stopgap migration tunnel protocols are blocked at our border firewall - we do native IPv6, so why should we allow tunnels etc? - ISATAP, Teredo and 6to4 are all explicitly blocked (we block very little outgoing stuff - Universities are MUCH more open than most businesses and K-12 schools).<br />
<br />
So, of course, when the user's device tried to get to a site over IPv6, the requests travelling along the 6to4 tunnel were dropped at the border firewall. Once the happy eyeballs timer expired, it fell back to IPv4 and things "just worked" - for that site. As more sites and services are available over IPv6, such a problem seemed to be a widespread issue - but it all came down to that mystery rogue logical adapter. Thanks, Windows.<br />
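<br />
To make the timing concrete, here is a deliberately simplified model of the fallback, not the full algorithm from the RFCs: the IPv6 attempt starts immediately, the IPv4 attempt starts after a fallback delay, and the first to complete wins. All the millisecond values are made-up assumptions; the real delay depends on the application and the OS.<br />

```python
def time_to_connect_ms(v6_rtt_ms, v4_rtt_ms, fallback_delay_ms):
    """Milliseconds until the first usable connection, or None if neither works.

    An RTT of None models a path that silently drops packets,
    like the 6to4 tunnel hitting our border firewall.
    """
    candidates = []
    if v6_rtt_ms is not None:
        candidates.append(v6_rtt_ms)
    if v4_rtt_ms is not None:
        # IPv4 is only tried once the fallback timer has expired.
        candidates.append(fallback_delay_ms + v4_rtt_ms)
    return min(candidates) if candidates else None


# Hypothetical numbers: 20 ms round trips, 250 ms fallback timer.
print("healthy dual stack:", time_to_connect_ms(20, 20, 250))
print("IPv6 blackholed:   ", time_to_connect_ms(None, 20, 250))
```

Every new destination pays the full timer before IPv4 gets a look in, which is why each new site felt slow while an already-loaded site carried on being fine.<br />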
<br />
There are <a href="https://support.microsoft.com/en-za/help/980486/a-new-microsoft-6to4-adapter-is-unexpectedly-created-after-you-restart">known bugs</a> (but not AFAIK in Win10, which this was) that cause 6to4 adapters to spawn. Here, however, there was just the one, on a wired interface.<br />
<br />
Lesson I learned for next time that might have told me the problem sooner? Check firewalls for blocked packets when things go slowly, not only when they just don't work at all.<br />
<br />
Of course, now I've *seen* this class of error in the wild, I'll recognise and fix it that much faster next time.<br />
<br />
<h1>Port Mirroring in Junos</h1>
<i>Published 2019-12-03; updated 2020-03-11.</i><br />
<br />
So sometimes, the manual isn't quite detailed enough...<br />
<br />
At the moment, our telephony people are having some issues with various things like handsets logging themselves out under certain conditions.<br />
<div>
<br /></div>
<div>
Vendor support has requested packet captures, and rather than schlepping around to the four corners of campus, dropping off laptops by each switch stack, I thought "wait, Junos does port mirroring, and it looks straightforward".<br />
<br />
It is, and it isn't...</div>
<br />
<a name='more'></a><br />
<div>
The documentation suggests it is as simple as<br />
<blockquote class="tr_bq">
<i>set forwarding-options analyzer your-portmirror-name input ingress interface &lt;interface you want to monitor&gt;<br />set forwarding-options analyzer your-portmirror-name output interface &lt;interface you'll plug your packet capture device into&gt;</i></blockquote>
<br />
Simple.<br />
Did that.<br />
Got nothing.<br />
Only saw the capture laptop's internet background radiation traffic.<br />
<br />
I suspect this was because the destination port wasn't configured in a helpful way: either the port type was wrong (perhaps <i>family ethernet-switching</i> was needed), or the interface-mode was not set (it needed <i>interface-mode trunk</i>), and so Junos was not spitting out packets there. It should be noted that the source interfaces I was playing with are tagged trunks with multiple VLANs on them, which may play a role; it may be that untagged interfaces behave differently (this suggests some lab experiments for another day).<br />
<br />
So, once I set the capture interface to trunk mode, verily did my laptop drink from the firehose of the uplink ports of two large switch stacks during normal office hours:<br />
<br />
<blockquote class="tr_bq">
<i>set interfaces &lt;your capture device interface&gt; unit 0 family ethernet-switching interface-mode trunk</i></blockquote>
<br />
You should usually also set your capture interface to grab egress frames (most people care about bidirectional traffic by the time they pull out wireshark). So for a complete capture solution for two interfaces, capturing ge-0/0/2 and ge-1/0/2 and outputting that to ge-0/0/14 with an analyzer called telephony-analyzer:<br />
<br />
<blockquote class="tr_bq">
<i>set forwarding-options analyzer telephony-analyzer input ingress interface ge-0/0/2</i> </blockquote>
<blockquote class="tr_bq">
<i>set forwarding-options analyzer telephony-analyzer input ingress interface ge-1/0/2</i> </blockquote>
<blockquote class="tr_bq">
<i>set forwarding-options analyzer telephony-analyzer input egress interface ge-0/0/2</i> </blockquote>
<blockquote class="tr_bq">
<i>set forwarding-options analyzer telephony-analyzer input egress interface ge-1/0/2</i> </blockquote>
<blockquote class="tr_bq">
<i>set forwarding-options analyzer telephony-analyzer output interface ge-0/0/14</i> </blockquote>
<blockquote class="tr_bq">
<i>set interfaces ge-0/0/14 unit 0 family ethernet-switching interface-mode trunk</i></blockquote>
(this assumes ELS platform syntax on EX gear, others may be subtly different)<br />
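<br />
If you need the same analyzer on several stacks, typing these out by hand gets old quickly. A minimal sketch that generates the set commands above for any list of source ports is shown here; the function name is my own invention, it spells the statement out in full as <i>set interfaces</i>, and it assumes the same ELS syntax, so check the output against your platform before pasting it anywhere.<br />

```python
def analyzer_commands(name, sources, output):
    """Build Junos 'set' commands mirroring `sources` (both directions) to `output`."""
    cmds = []
    for direction in ("ingress", "egress"):
        for ifname in sources:
            cmds.append(
                f"set forwarding-options analyzer {name} "
                f"input {direction} interface {ifname}"
            )
    cmds.append(f"set forwarding-options analyzer {name} output interface {output}")
    # The capture port needs trunk mode, as found out the hard way above.
    cmds.append(
        f"set interfaces {output} unit 0 family ethernet-switching interface-mode trunk"
    )
    return cmds


for line in analyzer_commands(
    "telephony-analyzer", ["ge-0/0/2", "ge-1/0/2"], "ge-0/0/14"
):
    print(line)
```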
<br />
Obviously, you need to consider whether the combined bandwidth will exceed the capacity of your monitor port, and size it correctly (or apply filters). At high enough packet rates, you may even need to think carefully about capture hardware. Fortunately, the experiment that needs these packet captures is planned for after hours during a maintenance window, so hopefully things will be calmer.<br />
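<br />
The sizing arithmetic is simple but easy to forget: mirroring both directions of a port can send up to twice its line rate at the monitor port. A minimal sketch of the check, with made-up traffic figures standing in for what your graphs would tell you:<br />

```python
def mirror_load_mbps(port_rates):
    """Total mirrored traffic in Mbps.

    `port_rates` is a list of (ingress_mbps, egress_mbps) tuples,
    one per mirrored source port.
    """
    return sum(rx + tx for rx, tx in port_rates)


def fits(port_rates, monitor_mbps):
    """True if the mirrored traffic fits within the monitor port's capacity."""
    return mirror_load_mbps(port_rates) <= monitor_mbps


# Hypothetical busy-hour figures for two uplinks, mirrored to a 1 Gbps port.
uplinks = [(300, 450), (200, 350)]
print(mirror_load_mbps(uplinks), "Mbps mirrored; fits in 1G:", fits(uplinks, 1000))
```

When the answer is "no", the options are the ones already mentioned: a faster monitor port, fewer sources, or filtering.<br />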
<br />
Another way of getting to the traffic you care about is only mirroring a VLAN - often the problem is restricted to a particular VLAN, and you'd like to dig into that, so instead of setting the ingress and egress interfaces, set ingress and egress VLAN ID, and the capture interface:<br />
<br />
<blockquote class="tr_bq">
<i>set forwarding-options analyzer telephony-analyzer input ingress vlan &lt;ID&gt;</i> </blockquote>
<blockquote class="tr_bq">
<i>set forwarding-options analyzer telephony-analyzer input egress vlan &lt;ID&gt;</i> </blockquote>
<blockquote class="tr_bq">
<i>set interfaces ge-0/0/14 unit 0 family ethernet-switching interface-mode trunk</i> </blockquote>
<blockquote class="tr_bq">
<i>set forwarding-options analyzer telephony-analyzer output interface ge-0/0/14</i></blockquote>
<div>
<i><br /></i></div>
Another option you can use is to set up a port mirroring VLAN (aka remote monitoring), and there are ways of further filtering the captured traffic with firewall filters if you need to do that (see the further reading links below), but I've not explored either of these as I had easy physical access to a port, and the vendor requested no filtering of capture files (!?). Throwing mirrored frames into a VLAN is a pretty handy way of dropping traffic off where you'd actually like to receive it, particularly on relatively flat networks.<br />
<br />
<h2>
Filtering is the better part of valour</h2>
<br />
Despite the vendor saying "no, no, no, don't filter anything, just break the packet capture up into 500Mb chunks (!)" I think one ought to carefully consider privacy and information security when supplying packet captures to 3rd parties (or even having them floating around on your disk for any length of time).<br />
<br />
Sure, lots of things are https:// these days - but not <i>everything</i> in a LAN is, and some (V)LANs are sensitive (this one potentially contains much finance and HR traffic, and any ongoing voice calls). Post capture, you should therefore filter down to only that which is relevant - things like <a href="https://osqa-ask.wireshark.org/questions/35546/filter-mac-address-of-a-particular-manufacturer">OUI filtering </a>will be rather helpful for vendor specific troubleshooting - and only send them that.<br />
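<br />
For the OUI approach, Wireshark display filters can match on the first three bytes of a MAC address with a slice. A minimal sketch that builds such a filter from a vendor prefix follows; the helper name is mine and the OUI shown is a placeholder, not the telephony vendor's.<br />

```python
def oui_display_filter(oui: str) -> str:
    """Build a Wireshark display filter matching frames to or from an OUI.

    Accepts 'aa:bb:cc' or 'AA-BB-CC' style prefixes.
    """
    parts = oui.replace("-", ":").lower().split(":")
    if len(parts) != 3 or any(len(p) != 2 for p in parts):
        raise ValueError(f"not a 3-byte OUI: {oui!r}")
    for p in parts:
        int(p, 16)  # raises ValueError if a byte is not hexadecimal
    # eth.addr matches source or destination; [0:3] slices the first 3 bytes.
    return "eth.addr[0:3] == " + ":".join(parts)


# Placeholder OUI; substitute the prefix used by the vendor's handsets.
print(oui_display_filter("00-00-5E"))
```

Apply the filter, then export only the displayed packets, and the file you send contains the vendor's devices and nothing from finance or HR.<br />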
<br />
Sometimes you need to see a packet capture before you know what is relevant, and if the vendor asks for more, you can filter it out and send it. Of course, normally you'd filter during capture (depending on how well you understand the likely problem), but sometimes following the letter of the vendor's instruction, and subsequently modifying it (with the option of going back to a "full" capture to tease out anything you inadvertently missed) is easier in the long run.<br />
<br />
When you're capturing packets in Windows, it (in)conveniently strips VLAN tags (at least it did on the 3 different laptops I tested this on), so be sure you know what traffic belongs in which VLAN if you're capturing traffic from more than one at a time (IP addresses help, but when you're diagnosing L2 stuff like DHCP, it's not going to be too obvious). <br />
<br />
When you're doing really big captures, you'll probably want to stream to disk using capture options, likely with the file size splitting option. Here, capturing a single VLAN at a time, or moving to a capture platform that preserves VLAN tags, would be useful. If you need to capture several simultaneously and not get them jumbled up, you should be able to set up (provided you have enough interfaces available) multiple port mirrors, one per VLAN. Alternatively, configure a device that <i>doesn't </i>discard VLAN tags to capture your frames (most linux-y things tend to be better at this, and tcpdump has plenty of useful options).<br />
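<br />
On the tcpdump side, a minimal sketch of assembling a capture command that writes to disk, rotates files by size and optionally narrows to one VLAN is shown below. The interface and file names are assumptions; -i, -w and -C are standard tcpdump options (-C counts in millions of bytes), and <i>vlan N</i> is a standard capture filter expression.<br />

```python
import shlex


def tcpdump_command(interface, out_file, chunk_mb, vlan_id=None):
    """Return a tcpdump argument list for a size-rotated capture to disk."""
    cmd = [
        "tcpdump",
        "-i", interface,        # capture interface
        "-C", str(chunk_mb),    # start a new file every N million bytes
        "-w", out_file,         # write raw frames to file instead of printing
    ]
    if vlan_id is not None:
        # Capture filter: only 802.1Q-tagged frames for this VLAN.
        cmd += ["vlan", str(vlan_id)]
    return cmd


# Hypothetical interface, file name and VLAN ID.
print(shlex.join(tcpdump_command("eth0", "telephony.pcap", 500, vlan_id=100)))
```

Building the arguments as a list also means the same thing can be handed straight to <i>subprocess.run</i> on the capture box, run with sufficient privileges.<br />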
<br />
<h2>
Further reading: </h2>
<br />
<ul>
<li><a href="https://www.juniper.net/documentation/en_US/junos/topics/task/configuration/port-mirroring-cli-els.html">Configuring Mirroring on EX4300 Switches to Analyze Traffic (CLI Procedure)</a></li>
<li><a href="https://www.juniper.net/documentation/en_US/junos/topics/example/port-mirroring-remote-ex-series-els.html">Example: Configuring Mirroring for Remote Monitoring of Employee Resource Use on EX4300 Switches </a></li>
<li><a href="https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/vlan-edit-ethernet-switching-options-analyzer-qfx-series.html">vlan (Port Mirroring)</a></li>
</ul>
<br />
<br /></div>
<h1>Glitchy RPKI OV and uRPF: one hell of a match</h1>
<i>Published 2019-10-23; updated 2019-10-24.</i><br />
<br />
I've <a href="https://schoolsysadmin.blogspot.com/2019/07/securing-internet-routing-rpki-ov-and.html">previously written about RPKI and mentioned some of the glitches</a>. It looks like there are some other fun interactions to be had if you're implementing all the "best practice" internet edge routing for your customers...<br />
<a name='more'></a><a href="https://www.manrs.org/">MANRS</a> is a great idea and initiative. I fully support the aims, and accept that there are (occasionally) going to be teething issues. We've however run into some "interesting" problems where RPKI hasn't worked terribly well. Some time last night, we ran into another one.<br />
<br />
Somehow, the router we peer with at one of our upstream ISPs lost its sessions with their RPKI OV validation server. This meant that as they learnt and evaluated our routes, they were being marked as "not found". Of course, they also had routes to our prefixes from our other ISP that were still marked as "valid", so that was marked as the preferred path.<br />
<br />
Of course, our edge routers were still merrily choosing that route for about 50% of the destinations on the Internet, and somewhere between the ISP's Provider Edge router and the rest of the internet, they vanished into a black hole.<br />
<br />
We saw the following things that looked really odd:<br />
<ul>
<li>ping testing to the peer router failed</li>
<li>traceroute past the peer router failed (it revealed the ISP's directly peered router, but nothing beyond it)</li>
<li>we could re-establish BGP sessions and prefixes would happily be exchanged, but still the "internet was broken". </li>
</ul>
<div>
So after some troubleshooting, it turned out RPKI OV at that ISP had failed again. Which led me to wondering why, if we were delivering packets, they weren't actually going anywhere else. There are valid routes, and ultimately, packets will normally get through somehow by delivery via as many routers as is reasonably necessary. My strong suspicion is that, having received packets from a <i>not found</i> RPKI route/interface (which the router will therefore not install as an active, valid route), <a href="https://en.wikipedia.org/wiki/Reverse-path_forwarding">uRPF</a> looks at that and goes "my dude, packets from that prefix are being received over a path that is not the best path back to that origin network. This path over <i>here</i> from that customer's other ISP is the path, because it's marked RPKI valid. These here packets MUST BE FRAUDS. Drop them!".<br />
<br />
So, buggy RPKI OV plus strict uRPF is a packet dropping perfect storm for people who are multi-homed.<br />
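<br />
To spell out that suspicion, here is a toy model of the interaction. It is a sketch of my guess at the behaviour, not the vendor's implementation: the route-preference rule, the interface names and the AS path lengths are all assumptions.<br />

```python
# Lower rank is preferred; this ordering is an assumed local policy.
VALIDITY_RANK = {"valid": 0, "not-found": 1}


def best_interface(routes):
    """Pick the interface of the preferred route back to a prefix.

    `routes` is a list of (interface, rpki_state, as_path_len) tuples.
    RPKI state wins first, then the shorter AS path.
    """
    return min(routes, key=lambda r: (VALIDITY_RANK[r[1]], r[2]))[0]


def strict_urpf_accepts(arrival_interface, routes):
    """Strict uRPF: accept only if packets arrive on the best return path."""
    return arrival_interface == best_interface(routes)


# The ISP's view of routes back to our prefix (hypothetical values).
healthy = [("customer-link", "valid", 1), ("via-other-isp", "valid", 3)]
broken = [("customer-link", "not-found", 1), ("via-other-isp", "valid", 3)]

print("validator up:  ", strict_urpf_accepts("customer-link", healthy))
print("validator down:", strict_urpf_accepts("customer-link", broken))
```

In the second case our packets still arrive on the customer link, but the router now thinks the way back to us is via the other ISP, so the strict check fails and the packets are dropped.<br />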
<br />
<a href="https://www.manrs.org/isps/">MANRS</a> suggests one ought to be doing both uRPF and RPKI OV, and we have an ISP that likes to do things properly (with a routing equipment supplier that has some glitchy implementations of some of the features needed). </div>
<div>
<br /></div>
<div>
Normally, when our link to the ISP fails, it's because somewhere along the light path from our secondary datacentre to their PoP nearly 1,000 km away, someone's fired a shotgun at the fibre, or stolen a section hoping it's copper, or run a backhoe through the cable, or a veldt fire has burnt through the fibre, or strong wind and rain have caused the massive long distance electricity pylons it's mounted on to fall over. (Yeah, we get quite a lot of faults. No, none of them are covered by the SLA. Yes, the root cause analysis sometimes brings some amusement). But every so often, it's something else. </div>
<div>
<br /></div>
<h2>
Descent Into Madness: Starting the Day off Just Wrong. </h2>
<div>
<br /></div>
<div>
In this case, we actually still had a link to the ISP, but traffic failed hard. Unfortunately, it took rather longer to diagnose the problem than might be ideal, or normal. </div>
<div>
<br /></div>
<div>
The perfect storm ran something along the lines of me getting an early morning message from my early rising colleague saying the link to that ISP was down. Having recently shown them the steps I take to diagnose that connection, I assumed that it really was down. And, well, fine, that doesn't cause us any kind of issues, other than higher risk for an outage or slower connectivity if the other ISP loses some links. I'll drop the down ISP a quick mail from my phone before I even get out of bed asking them to check for an outage on the fibre, and look into it in the office. <br />
<br />
Got into work and various people were saying "DNS was broken" and "what is bootstrap" (a couple of random DNS servers we check to see if we have working DNS recursion happen to be called "bootstrapcdn")? Let's pull out <i>dig +trace</i>. A LOT of DNS was broken. Eventually, I thought "right, this is widespread enough that it looks like a routing/connectivity problem, not a DNS problem". Either that, or there is a MASSIVE DDOS happening somewhere really important. Please, sinus headache, leave me alone so I can think of the commands I need. Also, coffee. Lots of coffee. </div>
<div>
<br /></div>
<div>
I then did<b> the </b>basic connectivity check any network engineer does - traceroutes to problematic destinations. </div>
<div>
<br /></div>
<div>
Hang on a minute. </div>
<div>
<br /></div>
<div>
What the hell? </div>
<div>
<br /></div>
<div>
How are traceroutes getting to an unpingable peer router and stopping there? Why are they even <i>attempting</i> that route if the fibre is down? </div>
<div>
<br /></div>
<div>
Maybe it's not as unreachable as people said. Maybe, just maybe, it's not as "down" as it appears we've now clearly <i>assumed</i> it was. </div>
<div>
<br /></div>
<div>
Let's go look at the BGP summary on the edge routers. </div>
<div>
What? The sessions are up and established to that router our monitoring system can't ping? Hmm. Let's clear the peering sessions. What? They re-establish and prefixes are exchanged and accepted? OK. That's weird. </div>
<div>
Must be some kind of problem with the upstream ISP. </div>
<div>
Let's kill the peering session. </div>
<div>
Oh look, working Internet. </div>
<div>
<br /></div>
<div>
Hi, ISP, so, sorry, we think it's not a fibre problem, but some kind of weird routing glitch or filtering issue at your end, as we can establish sessions and exchange prefixes - please can you check? (Back of my Mind: "I wonder if it's the revenge of the buggy RPKI again"). </div>
<div>
<br /></div>
<div>
A session re-establishment later, and they have some info to look through. </div>
<div>
<br /></div>
<div>
"Oh yeah, sorry, it was broken RPKI OV again, try now". </div>
<div>
<br /></div>
<div>
Yeah, that's working. </div>
<div>
<br /></div>
<div>
Facepalm. </div>
<div>
<br /></div>
<div>
<i>Hey network engineering colleague, let me go over troubleshooting connectivity again. </i></div>
<div>
This step *here* is really important. <i>Ask your routers what they see</i>. Be very suspicious of an un-pingable router that has established BGP sessions! </div>
<div>
BGP is, in some ways, really dumb. It assumes if you can reach the other router, it's going to deliver your packets. That's not necessarily the case. </div>
<div>
Pinging a remote router is a pretty basic test that doesn't tell you everything you need to know, and, well, can make an ASS out of U and ME. It tells you <i>something</i> might be wrong, but not necessarily <i>what</i>. A failed ping test is a failed ping test, no more, no less. It's an invitation to dig (or traceroute, or any number of other commands :) ) deeper.<br />
Oh, p.s. here's the documentation on massaging routes in our systems. Yeah, it's in the wiki. Yeah, I also read the documentation I write when running commands I rarely run. Particularly when I can't brain, because it's too early, my sinuses are bastards and there is no coffee. </div>
<div>
p.p.s. traceroute. It is your friend. It will show you weird broken crap really fast. </div>
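<div>
The troubleshooting order above can be written down as a tiny decision function. This is only a sketch of the reasoning in this post: the <i>PeerCheck</i> fields and the <i>triage</i> function are hypothetical names, and you would have to fill them in from your own monitoring, router CLI and traceroute output. </div>

```python
from dataclasses import dataclass


@dataclass
class PeerCheck:
    """Observations about one upstream peer (hypothetical inputs, gathered by hand)."""
    ping_ok: bool               # did the monitoring ping to the peer succeed?
    bgp_established: bool       # does the edge router show the session as Established?
    traceroute_completes: bool  # does a traceroute via this peer reach its destination?


def triage(check: PeerCheck) -> str:
    """Suggest where to look next, asking the routers before trusting ping."""
    if not check.bgp_established:
        return "session down: suspect the physical link or the peer itself"
    if not check.traceroute_completes:
        return "session up but traffic blackholed: suspect routing or filtering upstream"
    if not check.ping_ok:
        return "forwarding works: the peer is probably just not answering ICMP"
    return "healthy"


# The situation in this post: un-pingable peer, established session, dead traceroutes.
print(triage(PeerCheck(ping_ok=False, bgp_established=True, traceroute_completes=False)))
```

<div>
The point of the ordering is that a failed ping is the last thing consulted, not the first. </div>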
James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com2tag:blogger.com,1999:blog-5949954951402579361.post-58340210567164907332019-07-02T20:06:00.000+01:002019-07-02T20:06:07.855+01:00One year on, in a different kind of school...A year ago (to the day), I started what I would seriously consider a "dream job title" - that of Network Architect at a University (at the first place in sub-Saharan Africa to have an Internet connection - so long ago that its first IP address allocation was done within an RFC!). It ticked all my boxes in terms of favourite things to do, and promised to throw me in at the deep end in a much more challenging environment. I didn't have any particular desire to leave my last position (which I enjoyed immensely), but when I got two phone calls some time apart strongly suggesting I ought to apply for the position, and with an interesting list of "things we need done", after much hand-wringing, I jumped at the chance, and made it through the selection process. #GreatSuccess.<br />
<br />
<br />
<a name='more'></a><br /><br />
I've had to learn a lot quite quickly, but I'm blessed to have people that know more than I do as colleagues (my line manager was the previous incumbent of my position, and knows a <i>ridiculous</i> amount about all IT stuff, and the team I supervise has combined decades of experience on this campus at lower "layers", so they literally know where the skeletons are buried)! There are useful technical books lying around the place you can just sit down and read. There are several sysadmins lurking around the corridors, and a functioning helpdesk that helpfully means I typically only have to deal with "hard" problems, and my phone rarely rings (although I get quite a lot of SMS alerts from our monitoring system!).<br />
<br />
The position has very much lived up to my expectations - it's given me a much bigger, more complex environment to understand. It's more corporate (with the good and bad sides that brings). I have more toys to play with (and they're <i>shiny</i>). I get to work on pretty serious systems, and with some pretty complex topologies (simple enough to sketch on a napkin, but the details can bite you).<br />
<br />
It's not all sunshine and roses, of course. Like most organisations, resources are not limitless, and the environment in Higher Education is particularly constrained financially; because of the nature of ICT purchasing, the impacts of foreign exchange fluctuations hit us hard (right in the shiny). That just means we have to be a little bit more creative in our solutions to problems, and compromise on some things (which I've done my whole working life). A key side effect of this is there is a lot more open source software around - and the problems are unique enough (or have been annoying sysadmins here for long enough) that there are custom tools and toolchains to do things - one tool we use continuously to manage IP addresses has been in use since at least 1995. Most things are very "Unix-y" (mostly FreeBSD), and so I've had to up my scripting game considerably - but there are enough people with clue around that interesting problems get fixed quite fast if you do something like put it in a ticket that explains the problem and possible solutions. I write those tickets with the intention of sorting them out myself, but I do not look gift horses in the mouth - examining their dentition is a good way to learn, too, and I have people to ask questions of! It's been surprising how much more "silo-d" the organisation is, even just within the IT services division under a single Director - to some degree, it's a pretty classic Ops / Dev / Support split, complicated by how Universities work, and the degree of freedom research inevitably calls for in terms of ICT process and policy.<br />
<br />
The team I manage is growing; it ranges from a facilities management person (who deals with things like attending building planning meetings and managing HVAC and the like) - layer 0, if you like; 3 network technicians of various grades (Layer 1, with some Layer 2 config), and ultimately two network engineering posts (Layer 2-3+), under the Network Architect. The Networks team, aside from the obvious wireless, edge, core, distribution and access devices, is also nominally responsible for "networking services" like DNS, DHCP, NTP, IPAM, perimeter firewalls, RADIUS and the like. Much beyond L3, it's onto the Ops sysadmins or AppDev/DMU people. And the Ops sysadmins mostly deal with all the servers that run the networked services within their normal maintenance, so that's one other task off my chest.<br />
<br />
We had a fun six months (!) on-and-off fighting a horrible bug in a large enterprise vendor's wireless controller software - we kept trying to update beyond a fairly ancient version, with catastrophic results every time (tossing almost all the APs and clients off the network every 5-15 minutes). I'm still amazed it took as long as it did for them to finally admit that it was a problem in their controller software, and then do something about it. Fortunately, you could just roll back to an older version and regain stability, but there were a lot of late nights on Tuesdays and long phone calls to the vendor's tech support during maintenance windows! Why did we want an update? Well, the "stable" version was pre-KRACK patches; everything that crashed was post-KRACK patches! We are in an environment where people like to "play"... <br />
<br />
There's quite a bit of training that needs to be done, as we're actively building the team (and people are being promoted into new areas they've got relatively little experience in). Specialised recruitment here is hard (it's a small town, and salaries at Universities don't compete that well with national, let alone international, trends in IT - but you get to do interesting things in an interesting place, often with a lot more freedom than in a similarly large "corporate" environment). I enjoy mentoring people, so we can certainly "grow our own", to some degree, but it's quite hard to find the time when you yourself need to fix things (or learn it first!).<br />
Also, I think a lot of people that would do quite well in network engineering keep going across and becoming developers, thinking that's where all the action's at. Sorry, your app does nothing until the network, well, works! With the rise of SDN and infrastructure-as-a-service / network automation / DevNetOps, there's a <i>lot </i>of dev work to be done within networking teams these days.<br />
<br />
I've finally, I think, gotten to grips with enough of the immediate macro- and micro- things on the campus that it's time to tackle some bigger projects, rather than fighting immediate fires. Certainly, I have a nice list of things to do growing in the Network Engineering queue...<br />
<br />
So, whilst I'm no longer a K-12 "school sysadmin", in North America, they still refer to higher education as "school", so I guess in some ways, I still therefore do "school sysadmin", and there's no need to change this blog's name!<br />
<br />
It's been fun, and it continues to be fun!James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com0tag:blogger.com,1999:blog-5949954951402579361.post-63469094800756788742019-07-02T18:35:00.001+01:002019-10-24T07:33:56.065+01:00Securing Internet Routing: RPKI OV and ROAsFor some time now, I've had a ticket in my queue to "Investigate RPKI". A few weeks ago, we experienced some strange internet outages that turned out to be because not all is well with RPKI Origin Validation at one of our upstream ISPs...<br />
<br />
<a name='more'></a>The Internet owes much of its success to the marvel that is <a href="https://en.wikipedia.org/wiki/Border_Gateway_Protocol">BGP</a> based routing. Of course, like so many "early" Internet protocols and standards, it's <i>very</i> open and trusting. Just about anyone can claim to be a route source (hence "<a href="https://en.wikipedia.org/wiki/BGP_hijacking">BGP Prefix Hijacks</a>" are a thing). This is sort of an "Achilles Heel" in the global Internet.<br />
<br />
The Internet community has, therefore, long thought about more secure ways of doing routing, but they move slowly, as BGP is (by necessity) conservative in changes. There are a number of IETF Working Groups working on an overall solution called "Secure Inter Domain Routing" (<a href="https://datatracker.ietf.org/wg/sidr/about/">SIDR</a>, not to be confused with <a href="https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing">CIDR</a>!).<br />
<br />
Part of this is <a href="https://en.wikipedia.org/wiki/Resource_Public_Key_Infrastructure">RPKI</a>, and the first really workable section of that is Origin Validation (OV). This takes three parts; cryptographically sound declarations of valid origin AS numbers (ROAs) for specific prefixes; validation servers that routers can query; and then routers acting on that information (OV). The first one is very easy and relatively low risk (indeed NOT creating and publishing a ROA carries a risk if your ISP is doing OV); the other two, considerably less so.<br />
<br />
<h2>
RPKI and ROAs</h2>
<br />
Route Origin Authorizations (ROAs) basically establish a chain of trust (similar to a TLS certificate CA) from the Regional Internet Registries (<a href="https://en.wikipedia.org/wiki/Regional_Internet_registry">RIRs</a>) to cryptographically signed declarations of what Autonomous Systems (ASs) ought to be originating a prefix (advertising reachability as the last step in the routing pathway [AS-PATH]). This makes sense, because the RIRs are the organisations that "dish out" both IP address resources and AS numbers, so they know which organisations legitimately should originate which prefixes.<br />
<br />
Clever people thought that it would <b>not </b>be a good idea to have routers try to solve Internet scale problems. They have created a number of servers (software diversity can be a healthy thing) that essentially parse the ROA feeds of the various RIRs (and some other sources) to determine what ASs can validly originate a prefix. There are "public" servers of this sort you can use, but they're really something you want to have a high degree of trust in - so running your own, or relying only on those operators you *really* trust is wise (if your routers do Origin Validation). Relying Party servers are available from several places, including: <a href="https://github.com/RIPE-NCC/rpki-validator">rpki-validator</a>; <a href="https://nlnetlabs.nl/projects/rpki/routinator/">Routinator</a>; <a href="http://rpki.net/">rpki.net</a>; <a href="http://rpki.realmv6.org/">RTRLib</a>; <a href="https://medium.com/@jobsnijders/a-proposal-for-a-new-rpki-validator-openbsd-rpki-client-1-15b74e7a3f65">rpki-client(8)</a>; Cloudflare have a <a href="https://blog.cloudflare.com/rpki-details/">service</a> and <a href="https://github.com/cloudflare/gortr">validator</a>; there are others if you hunt around.<br />
<br />
Routers can then use a fairly simple protocol (<a href="https://tools.ietf.org/html/rfc8210">RPKI-RTR</a>) to query Relying Party Servers (aka Validator Caching Servers) to ascertain whether or not a particular prefix is Valid, Invalid or Unknown, and act accordingly - aka Origin Validation (OV).<br />
<br />
ROA decisions of caching servers come in three flavours - "Valid", "Invalid" and "Unknown". Most prefixes are "Unknown" - they don't have a ROA covering them. "Valid" denotes that a particular prefix has an origin AS that is covered by a legitimate ROA. "Invalid" means that at least one ROA covers the prefix, but none of them matches the announcement - something about it is off (either the announced prefix is more specific than the ROA's maximum length allows, or the origin AS is not in the ROA for the prefix). Basically, you're ascertaining whether the organisation that uses AS number XYZ is legitimately the "owner" of a particular prefix (X.Y.Z.Q/NN) (a routable IP address subnet, usually expressed in CIDR notation).<br />
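<br />
To make the three states concrete, here is a minimal Python sketch of the origin validation decision as I understand it from RFC 6811. It is an illustration, not what any particular router or validator runs; the <i>Roa</i> tuple, the <i>validate</i> function, and the documentation prefixes and AS numbers are all made up for the example.<br />

```python
import ipaddress
from typing import Iterable, NamedTuple


class Roa(NamedTuple):
    """One (prefix, max length, origin AS) entry taken from a ROA (hypothetical type)."""
    prefix: str
    max_length: int
    asn: int


def validate(announced: str, origin_asn: int, roas: Iterable[Roa]) -> str:
    """Return 'valid', 'invalid' or 'unknown' for one announced prefix and origin AS."""
    net = ipaddress.ip_network(announced)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # A ROA only "covers" an announcement that sits inside its prefix.
        if net.version != roa_net.version or not net.subnet_of(roa_net):
            continue
        covered = True
        # A match needs the right origin AS and a prefix no longer than max_length.
        if roa.asn == origin_asn and net.prefixlen <= roa.max_length:
            return "valid"
    # Covered but never matched is invalid; not covered at all is unknown.
    return "invalid" if covered else "unknown"


ROAS = [Roa("198.51.100.0/24", 24, 64500)]

print(validate("198.51.100.0/24", 64500, ROAS))  # right AS, right size
print(validate("198.51.100.0/24", 64501, ROAS))  # wrong origin AS
print(validate("198.51.100.0/25", 64500, ROAS))  # more specific than max_length
print(validate("203.0.113.0/24", 64500, ROAS))   # no covering ROA
```

Note that the result depends only on the prefix and the origin AS, not on which neighbour the route was learnt from.<br />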
<br />
ROAs, it turns out, are pretty painless things to deploy (visit your RIR's site with a valid organisational login, and issue away after carefully considering your prefixes and what the adequate ROAs will be). They won't solve all possible types of BGP hijack (particularly not some kinds of intentional attempts), but they should certainly cut down on "accidental" route origination. And one other problem goes away too...<br />
<br />
<h2>
The Big Surprise</h2>
So pretty much anyone who regularly uses BGP in their day job (particularly in service provider networks) has memorised the BGP Bestpath selection algorithm for their particular brand(s) of router (e.g. <a href="https://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13753-25.html">Cisco</a>, <a href="https://www.juniper.net/documentation/en_US/junos/topics/reference/general/routing-protocols-address-representation.html">Juniper</a>, other vendors). People spend a LOT of time and effort tuning various BGP parameters to carefully control how their routes are treated (and how they treat their customers' routes, in the case of ISPs) - carefully making use of that knowledge to achieve their desired routing policy (or even as basic a thing as a working Internet connection - or for <a href="http://drpeering.net/HTML_IPP/ipptoc.html">fun/profit</a>).<br />
<br />
<b>Newsflash</b>: <i>Valid ROA status supersedes ALL other Bestpath selection processes</i> - if there is a single Valid route, and all the others are Unknown or Invalid, the Valid one will be picked (even if it otherwise has a fairly low priority).<br />
<br />
Here's what Cisco (the vendor of the affected upstream ISP) has to say about Origin AS Validation:<br />
<blockquote class="tr_bq">
"By default, a prefix marked as Not Found is installed in the BGP routing table and will only be flagged as a bestpath or considered as a candidate for multipath <b>if there is no Valid alternative</b> (independently of other BGP attributes such as Local Preference or ASPATH)." p2, Cisco <i><a href="https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-3s/irg-xe-3s-book/bgp-origin-as-validation.pdf">BGP—Origin AS Validation</a></i> document - emphasis mine.</blockquote>
Furthermore:<br />
<blockquote class="tr_bq">
"During BGP best path selection, the default behavior, if neither of the above options is configured, is that the system will prefer prefixes in the following order:<br />
- Those with a validation state of valid.<br />
- Those with a validation state of not found.<br />
- Those with a validation state of invalid (which, by default, will not be installed in the routing table).<br />
<b>These preferences override metric, local preference, and other choices made during the bestpath computation.</b><br />
<i>The standard bestpath decision tree applies only if the validation state of the two paths is the same</i>." p4, (both emphases mine), Cisco <i><a href="https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-3s/irg-xe-3s-book/bgp-origin-as-validation.pdf">BGP—Origin AS Validation</a></i> document</blockquote>
This means that a single spurious "valid" route can wreak merry hell on your routing.<br />
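<br />
A toy model shows why. The sketch below is not Cisco's implementation, and real bestpath selection has many more steps than local preference and AS path length; the <i>Path</i> class, <i>best_path</i> function and addresses are hypothetical. It only encodes the ordering quoted above: validation state is compared first, and everything else is a tie-breaker.<br />

```python
from dataclasses import dataclass
from typing import List, Optional

# Lower rank wins; "not-found" is what this post elsewhere calls "Unknown".
STATE_RANK = {"valid": 0, "not-found": 1, "invalid": 2}


@dataclass
class Path:
    next_hop: str
    local_pref: int
    as_path_len: int
    state: str  # "valid", "not-found" or "invalid"


def best_path(paths: List[Path]) -> Optional[Path]:
    """Pick a path with validation state ranked ahead of the usual attributes."""
    # Per the quoted default, invalid paths are not installed at all.
    candidates = [p for p in paths if p.state != "invalid"]
    return min(
        candidates,
        # Validation state first, then highest local preference, then shortest AS path.
        key=lambda p: (STATE_RANK[p.state], -p.local_pref, p.as_path_len),
        default=None,
    )


good = Path("192.0.2.1", local_pref=250, as_path_len=4, state="not-found")
odd = Path("192.0.2.2", local_pref=150, as_path_len=7, state="valid")

print(best_path([good, odd]).next_hop)  # the "valid" path wins despite lower local pref
odd.state = "not-found"
print(best_path([good, odd]).next_hop)  # equal states: normal attributes decide again
```

With both paths in the same state, the carefully tuned local preference decides; flip one to "valid" and the tuning is ignored.<br />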
<br />
Amazingly, there doesn't seem to be a "knob" to disable this behaviour for specific prefixes (like, say, your customers) - you can override what happens to Invalid, but not Unknown / Not Found - other than turning it off entirely - which rather defeats the point...<br />
<br />
We came across a fairly tragic (repeated) instance where this happened to us - a nonsense path "randomly" became marked as Valid, and that resulted in our traffic getting blackholed in some way (most likely the routing within our ISP was then b0rked, because there wasn't actually a working route to us from that "valid" router, which we were not connected to). Sadly, only one of our ISPs has a looking glass we can use to interrogate their view of the world. When traceroute fails to your ISP, you know you're having a problem...!<br />
<br />
Here are two examples, one of a case I'm calling "Phantom Validity", and one of "Sticky Validity". Whether this is a bug in Cisco routers, a particular glitchy router or a glitch in the Relying Party Validators, I can't say, but I've sent all available info to the ISP for them to digest.<br />
<br />
Phantom Validity:<br />
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Network Next Hop Metric LocPrf Weight Path</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* ia 146.231.128.0/21 41.78.189.211 0 150 0 3356 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* ia 41.78.189.234 0 150 0 3356 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* ia 41.78.189.226 0 150 0 1299 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* ia 41.78.189.210 0 150 0 2914 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* i 41.78.189.211 0 150 0 2914 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* i 41.78.189.234 0 150 0 2914 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* i 41.78.189.211 0 150 0 1299 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* i 41.78.189.210 0 150 0 1299 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* i 41.78.189.234 0 150 0 6453 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* i 41.78.189.226 0 150 0 3257 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* i 41.78.189.234 0 150 0 6762 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* i 41.78.189.210 0 150 0 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* i 41.78.189.226 0 150 0 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><span style="color: red;"><b>V*> 196.60.8.216 0 250 0 2018 37520 37520 37520 i</b></span></span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* ia 41.78.189.188 0 250 0 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N*bia 41.78.189.156 0 250 0 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* ia 41.78.189.116 0 250 0 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* ia 41.78.189.117 0 250 0 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">N* ia 41.78.189.52 0 250 0 2018 37520 37520 37520 i</span></blockquote>
Above, the route flagged V is incorrectly marked as "V"alid - it's "phantom" valid, and in this state, all hell breaks loose (all should be N). We saw this happen on a prior occasion to another of their routers (different IP) - and probably several other weird routing outages like this due to this glitch (we just didn't have the looking glass's view for those outages, because we spent some time thinking it was a problem with our firewalls or internal routing gone squiffy).<br />
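<br />
Since the validation state should depend only on the prefix and the origin AS, every path for one prefix with the same origin ought to carry the same flag. A rough Python sketch for spotting the mismatch in pasted looking-glass output follows. The function names are hypothetical, the parsing assumes the whitespace-flattened layout shown above (flag first, origin AS as the second-to-last field), and it assumes all the lines you feed it belong to a single prefix.<br />

```python
from collections import defaultdict
from typing import Dict, Set

# A few lines copied from the table above (all for the same prefix).
SAMPLE = """\
Network Next Hop Metric LocPrf Weight Path
N* ia 146.231.128.0/21 41.78.189.211 0 150 0 3356 174 2018 37520 37520 37520 i
N* i 41.78.189.210 0 150 0 174 2018 37520 37520 37520 i
V*> 196.60.8.216 0 250 0 2018 37520 37520 37520 i
N* ia 41.78.189.188 0 250 0 2018 37520 37520 37520 i
"""


def states_by_origin(table_text: str) -> Dict[int, Set[str]]:
    """Map each origin AS to the set of validation flags (V/N/I) seen for it."""
    seen: Dict[int, Set[str]] = defaultdict(set)
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0][0] not in "VNI":
            continue
        origin_as = fields[-2]  # last field is the origin code, e.g. "i"
        if not origin_as.isdigit():
            continue  # skips the header line, which also starts with "N"
        seen[int(origin_as)].add(fields[0][0])
    return dict(seen)


def inconsistent_origins(table_text: str) -> Dict[int, Set[str]]:
    """Return only the origin ASes whose paths disagree on validation state."""
    return {asn: s for asn, s in states_by_origin(table_text).items() if len(s) > 1}


print(inconsistent_origins(SAMPLE))
```

Anything this returns is worth a closer look before blaming your own firewalls.<br />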
<br />
Here's a possibly related thing - spurious validation of a prefix with no covering ROA (it had expired several days previously). All should be "N", NOT "V" - perhaps once a route is "valid" it's "sticky"? (We reset BGP connections to both upstream ISPs and that did not help). Their validator cache correctly said that this wasn't covered by a ROA - so I think "Cisco might have a problem".<br />
<br />
Sticky Validity:<br />
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Network Next Hop Metric LocPrf Weight Path</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* ia 192.42.99.0 41.78.189.226 0 150 0 1299 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* ia 41.78.189.234 0 150 0 3356 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* ia 41.78.189.211 0 150 0 3356 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* i 41.78.189.234 0 150 0 6762 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* i 41.78.189.226 0 150 0 3257 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* i 41.78.189.234 0 150 0 6453 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* i 41.78.189.234 0 150 0 2914 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* ia 41.78.189.210 0 150 0 2914 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* i 41.78.189.211 0 150 0 2914 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* i 41.78.189.210 0 150 0 1299 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* i 41.78.189.211 0 150 0 1299 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* i 41.78.189.210 0 150 0 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* i 41.78.189.226 0 150 0 174 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* ia 41.78.189.117 0 250 0 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* ia 41.78.189.116 0 250 0 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* ia 41.78.189.52 0 250 0 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V*bia 41.78.189.156 0 250 0 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V* ia 41.78.189.188 0 250 0 2018 37520 37520 37520 i</span> </blockquote>
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">V*> 196.60.8.216 0 250 0 2018 37520 37520 37520 i</span></blockquote>
We think we're amongst the first people to see glitches like this - I had a quick google and couldn't find much about it - but it's there waiting to bite people. <br />
<br />
Apparently, the affected ISP is the "<a href="https://www.itweb.co.za/content/4r1ly7RoW9AMpmda">first in Africa</a>" to deploy RPKI Origin Validation. I've raised the issue with them, so hopefully it's something their NOC now know to look for if their customers have weird routing glitches - we've deployed ROAs for all our prefixes now, so we don't care about either of these errors any more!<br />
Hopefully they in turn will open a ticket with their router vendor to figure it out.<br />
<br />
<h3>
Glitchiness as a "feature"...</h3>
A few days back (after writing the above), I asked for some feedback about the odd glitches with RPKI we'd seen from the ISP.<br />
<br />
They were kind enough to reply with the following illuminating information (slightly edited):<br />
<blockquote class="tr_bq">
<blockquote class="tr_bq">
<i>There isn't so much a document for this 'bug' It's more of their implementation results in undesirable effects in our setup.</i></blockquote>
<blockquote class="tr_bq">
<i>There are 2 main issues:</i></blockquote>
<blockquote class="tr_bq">
<i>1) Cisco routers will prefer routes with a status of “Valid” to those with a status of “Not Found”. This behaviour does not appear to be configurable. (In our case we want to treat these routes equally and only drop "Invalids")</i></blockquote>
<blockquote class="tr_bq">
<i>2) When validation state checking is enabled by establishing an RPKI-RTR session with a cache, only those routes learnt from an eBGP neighbour are evaluated, with all iBGP-learnt routes automatically marked “Valid”</i></blockquote>
<blockquote class="tr_bq">
<i>The implication of this is that as a route is received over an eBGP session on an ASBR, the route is correctly marked with a status "Not Found". However once this route is transmitted over iBGP to the rest of the network, they are then marked as "Valid" by the recipient iBGP speakers and since the routers will prefer "Valid" over "Not found", the undesirable routing path is selected.</i></blockquote>
</blockquote>
So that pretty much mirrors what we saw. 1 is to be expected (and is, once you dig in, documented as noted above), but 2 is a little surprising, and seems to be what caused our traffic to die a horrible death, and is, as far as I can see, responsible for "phantom validity" as discussed above.<br />
<br />
<h2>
The Big Fix</h2>
All you have to do is publish a ROA for your prefixes. Simple!<br />
<br />
Obviously this assumes a) you have a BGP AS and b) you have a provider independent allocation of IPv4 and/or IPv6 address space. You simply use the tools your RIR provides (<a href="https://afrinic.net/resource-certification">AFRINIC</a>, <a href="https://www.ripe.net/manage-ips-and-asns/resource-management/certification/resource-certification-roa-management">RIPE NCC</a>, <a href="https://www.lacnic.net/1151/2/lacnic/rpki-faq">LACNIC</a>, <a href="https://www.apnic.net/get-ip/faqs/rpki/">APNIC</a>, <a href="https://www.arin.net/resources/manage/rpki/roa_request/">ARIN</a>). It's very straightforward (once you jump through any RIR hoops), and I can't see that it has any downsides, so long as the ROAs you publish actually reflect the way you announce your address space. Obviously, if you don't do BGP or "own" a PI resource, there's not much you can do about it, but your ISP may be able to help. Typically, by the time you're doing BGP, you've got PI space and (at least) two upstream providers, so your RIR will be where you'll go to set up ROAs.<br />
<br />
There are numerous guides out there that will tell you a lot about these, but my suggestion is:<br />
<ol>
<li>Create one or more ROAs (you might want to manage different address families, or different blocks of IP space, independently) - but ensure that you keep the largest prefix in the same object as the smaller ones. You can have multiple address families and prefixes in one ROA, too. </li>
<ol>
<li>i.e. if you have a /16, publish a ROA for the whole /16, plus within the same ROA, any smaller prefixes you will actually advertise (rather than using maxlength /24 to allow you to dis-aggregate whenever, which is not always clever). </li>
</ol>
<li>Match your ROAs to the <i>route</i> and/or <i>route6</i> objects you're publishing in your <a href="https://en.wikipedia.org/wiki/Internet_Routing_Registry">IRR</a> records - start by getting your <a href="http://www.irr.net/docs/rpsl.html">IRR records right</a>, and then replicate that same policy in your ROAs. </li>
<li>Make sure you keep both up to date with the routes you're actually exporting. </li>
<li>Put the expiry date of your ROAs into your calendar, and renew them before they expire! </li>
</ol>
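To make the "explicit prefixes versus a loose maxlength" point concrete, here is a minimal sketch of origin validation in the spirit of RFC 6811. It is not how any particular validator is implemented; the <code>Roa</code> class, the <code>validate</code> function, the 10.20.0.0/16 block and AS 64500/64501 are all made up for illustration.<br />

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Roa:
    """Hypothetical, simplified ROA: one prefix, one maxLength, one origin AS."""
    prefix: str
    max_length: int
    origin_as: int


def validate(route: str, origin_as: int, roas: list) -> str:
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    announced = ip_network(route)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # Only a ROA of the same family that covers the route is relevant.
        if announced.version != roa_net.version or not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.origin_as == origin_as and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some ROA but matched by none: invalid. No covering ROA: not-found.
    return "invalid" if covered else "not-found"


# Explicit approach: list only what is really announced.
explicit = [Roa("10.20.0.0/16", 16, 64500), Roa("10.20.128.0/17", 17, 64500)]
# Loose approach: one ROA with maxLength /24 "just in case".
loose = [Roa("10.20.0.0/16", 24, 64500)]

print(validate("10.20.128.0/17", 64500, explicit))  # valid
print(validate("10.20.5.0/24", 64500, explicit))    # invalid
print(validate("10.20.5.0/24", 64500, loose))       # valid
print(validate("10.20.0.0/16", 64501, explicit))    # invalid
```

With the loose ROA, a more-specific /24 that nobody intended to announce still validates, as long as the origin AS matches (and the origin AS is easy to fake). With the explicit ROAs, the same announcement is Invalid.<br />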
If you have a crazy fast need for automated changes (like maybe you're a massive global enterprise, and you need to have a really flexible routing policy - like perhaps a massive cloud hyperscale company, or transit provider) there are ways of delegating a RPKI CA-like function to your organisation so you can completely control the ROA generation process in house (but you'll probably need $$$$$ cryptographic <a href="https://en.wikipedia.org/wiki/Hardware_security_module">HSMs</a> to do it right) - <a href="https://www.arin.net/resources/manage/rpki/delegated/">RPKI Delegation</a>. There are even existing tools to help you do this, like <a href="https://nlnetlabs.nl/projects/rpki/krill/">Krill</a>.<br />
<br />
Once you've jumped through your RIR's RPKI ROA hoops, this will mean that your intentionally announced prefixes will be covered by real, valid ROAs, so the normal bestpath selection algorithms will "win", and you'll get the desired results. They propagate surprisingly quickly; several sites will let you check the status of RPKI records for specified prefixes.<br />
<br />
RPKI ROAs are a really "low harm" (and to be honest, really low effort) thing to deploy (which is good, because it's going to mean it's much easier and faster to get this done at global scale, for the good of all).<br />
<br />
With the interesting effects noted above, as ISPs and IXPs increasingly do RPKI OV, you're going to have a better time if you do have a ROA than if you don't! AKA do it sooner rather than later.<br />
<br />
<h2>
What's <i>not</i> fixed in this shiny new ROA world? </h2>
So RPKI Route Origin Validation does not solve all of our problems (on a global basis).<br />
<br />
The main thing that isn't really solved is malicious hijacks - people who intentionally try to grab your traffic.<br />
<br />
However, some basic AS PATH sanity checking and ROA validation certainly should help stop "accidental" route hijacks and leaks. Last month, <a href="https://www.zdnet.com/article/for-two-hours-a-large-chunk-of-european-mobile-traffic-was-rerouted-through-china/">a glitch caused chaos in Europe</a>; there were prior cases such as Pakistan blackholing YouTube, there is an ongoing list of these things, and not long ago there was the infamous case where Verizon didn't filter nonsense from clients and <a href="https://blog.cloudflare.com/how-verizon-and-a-bgp-optimizer-knocked-large-parts-of-the-internet-offline-today/">broke Cloudflare</a> (amongst other things). The latter would probably have been mitigated, so long as the leaked prefixes were smaller than those covered by ROAs. A lot of this is really up to the "edge" of highly connected networks (i.e. ISPs need to filter their customers). It's WAY harder to do this in the <a href="https://en.wikipedia.org/wiki/Default-free_zone">DFZ</a> - but anyone who sells access to this ought to filter their customers...!<br />
<br />
One critical problem it <b>doesn't</b> address is whether a router that claims to be AS XYZ is <i>actually</i> a router belonging to AS XYZ! I could, for instance, run a router at some random spot on the Internet, and (if my peers are not all that fussy) simply declare it to be a member of AS XYZ and start originating routes - or more sneakily, pretend to be your ISP and fake origin routes from you (a related problem to direct origin AS spoofing). If I match the origin AS to the prefixes you actually use, then these routes may even be marked as RPKI Valid!<br />
<br />
So, what we need next is some way of proving an origin router is actually legitimately a member of that AS. This is a Hard problem, because routers don't really have any good way of figuring out if some IP address waaaaay on the other side of the world is a valid member of a particular AS, and that the AS PATH hasn't simply been spoofed to look "right". <br />
Normally, you'd end up having to have a "handshake" between both endpoints (and arguably, every router in between) to prove routers are what they claim to be - this could be incredibly taxing to the global routing infrastructure (and it's<i> not</i> a feature that exists). It's way more work if literally every router on the Internet might come up to you and do the equivalent of a TLS handshake - which is the real "death knell" to this (there are other complications too, of course). So the "obvious" routes don't scale to Internet size (let alone the potential DDOS abuse factor). Which is, presumably, why we've not made much progress here.<br />
<br />
An alternative might be something like BGP Path Validation. IRR based filters can, to some degree, already help with this (you can publish a policy in WHOIS of what valid BGP paths should look like with as-path). Obviously, a determined attacker could spoof that, too, and they merely need to spoof a long enough AS-PATH before their own AS before they can get your traffic... Of course, longer AS paths are less desirable! This also doesn't scale well. Is a path AS4 AS3 AS2 AS1 valid? You're not a customer of AS 3 or 4, who are massive "tier one" ISPs. The only really definitive statements you can make are about your own AS and your immediately upstream ASes (i.e. your direct Peers).<br />
<br />
It is likely that the only way we're going to "fix" these kinds of problems is a lot of work by ISPs, IXPs and mega-transit providers in sanity checking the routes they're learning from peers of various sorts. For <a href="https://www.manrs.org/ixps/">IXPs</a> and <a href="https://www.manrs.org/isps/">ISPs</a>, <a href="https://www.manrs.org/about/">MANRS</a> is a good start to this. Obviously, this requires more cooperation, but leverages existing things (contractual or trust relationships) that already exist.<br />
<br />
One final "problem" (conspiracy theory?) is that because the RIRs control the TALs, theoretically, the jurisdiction in which the RIR operates can "seize" anyone's resources (i.e. by messing with the ROA certificates). I suspect this is A) highly unlikely and B) the "ignore invalid" knob paired with import filters on prefixes hijacked by rogue RIRs (or their host country) will work around it in minutes/hours as news spreads that this has happened - not a valid concern IMO - if they break RPKI that badly, operators will stop using it.<br />
<br />
<h2>
What else can you do? </h2>
Whilst you're thinking about securing your routing, what other steps can you take?<br />
<h3>
Sane export policy - the most important step</h3>
As an "end user" network, you can also help to ensure that you don't inadvertently leak absolute nonsense onto the Internet with suitably strict export filters. Whilst you're checking your export policies, make sure they're as compact as possible (don't announce smaller prefixes if you don't have to - the larger and more contiguous your prefixes, the smaller you help keep the global routing table) - i.e. if you don't actually have separate routing policies in place for parts of your /16, advertise it as a single /16, not 256 /24s - and don't leak what is your IGP into your EGP - aggregate your routes! Make sure you can't announce things like bogons or prefixes you don't own. There is plenty written elsewhere about good export policies, and if you're exposing the world to your BGP configuration, you owe the Internet a few hours (at least) of learning on the topic to make sure you're not b0rking it.<br />
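<br />
As a rough illustration of "aggregate, and only export what you hold", here is a small sketch using Python's standard <code>ipaddress</code> module. The function names and the 10.20.0.0/16 allocation are hypothetical; a real export policy lives in your router configuration, not in a script like this.<br />

```python
from ipaddress import collapse_addresses, ip_network


def compact(prefixes):
    """Merge prefixes into the fewest covering networks, per address family."""
    nets = [ip_network(p) for p in prefixes]
    merged = []
    for version in (4, 6):
        # collapse_addresses() refuses mixed families, so handle them separately.
        family = [n for n in nets if n.version == version]
        merged.extend(collapse_addresses(family))
    return [str(n) for n in merged]


def exportable(prefixes, allocations):
    """Keep only prefixes that sit inside address space we actually hold."""
    held = [ip_network(a) for a in allocations]
    kept = []
    for p in prefixes:
        net = ip_network(p)
        if any(net.version == h.version and net.subnet_of(h) for h in held):
            kept.append(p)
    return kept


# An IGP "leak": every /24 of a /16 offered for export.
igp_routes = [str(n) for n in ip_network("10.20.0.0/16").subnets(new_prefix=24)]
print(len(igp_routes))      # 256
print(compact(igp_routes))  # ['10.20.0.0/16']

# A prefix we do not hold is dropped before it can be announced.
print(exportable(["10.20.0.0/16", "192.0.2.0/24"], ["10.20.0.0/16"]))  # ['10.20.0.0/16']
```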
<br />
Just because your ISP claims to filter your prefixes doesn't mean they'll get it right (everyone has bad days) - and the more checks and balances there are between network engineers having a bad day (or router software glitches) and the global internet routing table, the better. Spend the time making your BGP export policy configuration robust and "fail safe".<br />
<h3>
If you're not already using an IRR, publish route / route6 objects</h3>
Use your RIR's tools to publish route and/or route6 objects; other organisations can use these to build better filters. Some ISPs (particularly those doing MANRS in earnest) will insist these exist before they'll accept your routes. Ensure you understand what you're doing before you mess around with this! This post is long enough without going into <a href="http://www.irr.net/docs/rpsl.html">IRR records</a> in detail... :)<br />
<br />
<h3>
Don't export itty bitty routes for no reason</h3>
If you're announcing a prefix that is Provider Aggregate (PA) space from your ISP back to them (to establish a BGP route for that prefix, for instance), add the NO-EXPORT BGP community attribute to that/those prefix(es); obviously, don't do this for Provider Independent (PI) space, because you probably <i>do</i> want them to export that for you...<br />
You can assume they've got a covering prefix for their own space that will result in the Internet maintaining reachability to you - just without the nonsense of some small part of their overall allocation polluting global routing tables for no good reason. So if they've allocated 10.2.3.0/24 to you, and they announce 10.0.0.0/8, you're good to no-export this prefix on your edge router(s).<br />
Rule of thumb: If a prefix is only announced by one ISP, and their routing policy is no different to yours, and it's PA space, make sure their routers don't announce it any further by using a no-export community on that prefix when you export it. <br />
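<br />
The rule of thumb can be written down as a tiny decision function. This is only a sketch of the logic (the function name and arguments are made up); on a real router you would express it as a policy statement that attaches the well-known NO_EXPORT community (65535:65281, from RFC 1997) to matching prefixes.<br />

```python
from ipaddress import ip_network

# Well-known NO_EXPORT community from RFC 1997, written as (high, low).
NO_EXPORT = (65535, 65281)


def communities_for(prefix, isp_aggregates):
    """Return the communities to attach when announcing `prefix` to the ISP.

    `isp_aggregates` is the (hypothetical) list of covering prefixes the ISP
    already announces for its own Provider Aggregate space.
    """
    net = ip_network(prefix)
    is_pa_space = any(
        net.version == agg.version and net.subnet_of(agg)
        for agg in map(ip_network, isp_aggregates)
    )
    # PA space is already reachable via the ISP's aggregate, so keep it local.
    return [NO_EXPORT] if is_pa_space else []


# The example from the text: the ISP announces 10.0.0.0/8 and gave us 10.2.3.0/24.
print(communities_for("10.2.3.0/24", ["10.0.0.0/8"]))   # [(65535, 65281)]
# PI space is outside the ISP's aggregate, so it is exported normally.
print(communities_for("192.0.2.0/24", ["10.0.0.0/8"]))  # []
```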
<br />
If you have PI space, announce <i>all</i> your prefixes to <i>all </i>your Peers/ upstream ISPs (unless you have a compelling reason not to), but be as compact as possible (the only reason to split up a large block is because you have different routing policy for some smaller blocks for e.g. traffic engineering purposes, or restrictions on particular kinds of traffic over some ISP links, etc).<br />
<br />
We've recently started doing just this, and the prefixes the global Internet sees from us <a href="https://bgp.he.net/AS37520">no longer contain PA space</a> (as it should be!) - we've gone from announcing 10 IPv4 prefixes to 4; and from 5 IPv6 to 2. That's 6 fewer routes in the global IPv4 table, and 3 fewer for IPv6. Almost all of that was through adding no-export to PA space we're using (and withdrawing one /21 prefix we no longer have separate routing policy for).<br />
<br />
You can get this really wrong...<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigoKEy9ZH21lesCpXENHcfBc3gQ2VYU1PQNFMD8HKZW6VTC5g7tWb3hYLYYe1TXdR8fqaZUWsWTxPIiK9Xkr-7CLEOJZqGma2eFJ4mgeRernrVOS6PMPTUDDu9OH_LE9SiOUXCz7dWTF0/s1600/HowNotToAdvertiseSingleHomedLegacyIPv6.PNG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="528" data-original-width="794" height="212" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigoKEy9ZH21lesCpXENHcfBc3gQ2VYU1PQNFMD8HKZW6VTC5g7tWb3hYLYYe1TXdR8fqaZUWsWTxPIiK9Xkr-7CLEOJZqGma2eFJ4mgeRernrVOS6PMPTUDDu9OH_LE9SiOUXCz7dWTF0/s320/HowNotToAdvertiseSingleHomedLegacyIPv6.PNG" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Here's how NOT to advertise routes when you're single homed...!<br />
That's *55* routes, which could be summarized as just 4 - with <i>no difference to the organisation</i>.<br />
I've obscured the origin AS Number, but it's another University in South Africa. It looks like they've "leaked" their IGP into BGP. Data from bgp.he.net.</td></tr>
</tbody></table>
There are, absolutely, valid reasons for splitting up your address allocation (usually for traffic engineering, or because the routing policy between different prefixes differs), but for most organisations, most of the time, what you want to route is your entire prefix; don't clutter the global routing table unnecessarily (particularly if you have a honkin' great big chunk of Legacy space). It kind of irks me that because of binary trees, I need to advertise 3 prefixes from a /16 when I actually want to only advertise 2, but there's a limit to what CIDR can do in aggregating on bit boundaries (we use the top 1/4 of the space for different purposes than the bottom 3/4; subnet maths means that's most compactly a /17 and two /18s)<br />
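<br />
The subnet maths is easy to check with Python's <code>ipaddress</code> module. The 10.20.0.0/16 block below is a stand-in, not our real allocation.<br />

```python
from ipaddress import collapse_addresses, ip_network

block = ip_network("10.20.0.0/16")           # stand-in for the real /16
quarters = list(block.subnets(new_prefix=18))  # four /18s

# The bottom three quarters share one policy, the top quarter has another.
bottom = list(collapse_addresses(quarters[:3]))
top = [quarters[3]]

announcements = [str(n) for n in bottom + top]
print(announcements)  # ['10.20.0.0/17', '10.20.128.0/18', '10.20.192.0/18']
```

Three-quarters of a /16 does not fall on a single bit boundary, so the best CIDR can do is a /17 plus a /18 for the bottom part, and another /18 for the top.<br />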
<h3>
Insist on MANRS and RPKI OV</h3>
You can also ask your ISPs or IXPs if they're doing anything about <a href="https://www.manrs.org/isps/">MANRS</a> or RPKI OV - and even build that into your purchasing decisions (put it into your RFPs as a bid specification item!).<br />
<br />
Make sure your ISP(s) have a BGP <a href="https://en.wikipedia.org/wiki/Looking_Glass_server">Looking Glass</a>...! As someone who runs a network using BGP, it is useful to know a couple of public ones on top of any your ISP may run (ISPs are useful to figure out what they're doing with your prefixes; others are useful for views of what the global Internet is doing with them).<br />
<br />
A good sign is when your ISP publicly lists the BGP communities they make use of (which are quite handy in BGP policy, both on import (which routes you accept and use) and export (which prefixes you advertise and how)).<br />
<br />
At the end of the day, ROA helps you against accidental (or unsophisticated intentional) prefix hijacks by 3rd parties. However, protection against intentional spoofing will require much more work by ISPs, IXPs and global transit companies ("Tier One ISPs"). The more ISPs and IXPs are strict about MANRS and RPKI OV, the more effective they'll get. At some stage, it will reach a tipping point, and NOT deploying these things will be a problem.<br />
<br />
<h3>
Monitor your prefixes on the global Internet</h3>
You can subscribe to services that will alert you to BGP routing changes, like <a href="https://bgpmon.net/">BGPMon</a> to help you see any "funny business" that might happen to your prefixes - importantly, this is an "internet scale" view.<br />
<br />
<h3>
Do your own prefix validation</h3>
If you want to filter prefixes for validity (say, for example, your upstream ISP doesn't), you can get your edge routers to do so. Find a reliable 3rd party validator you trust, or run several of your own. There are good guides available online; Juniper, for instance, has a "Day One" guide that steps you through everything you need to know.<br />
<br />
<h3>
Some other quick wins</h3>
<br />
<ul>
<li>If you peer at INXs, make use of <a href="https://www.peeringdb.com/">peeringdb</a>. </li>
<li>If you're really big, and don't want to look like a tit (hello, <a href="https://blog.cloudflare.com/how-verizon-and-a-bgp-optimizer-knocked-large-parts-of-the-internet-offline-today/">Verizon</a>), have a responsive 24/7/365 NOC, and consider <a href="https://en.wikipedia.org/wiki/INOC-DBA">INOC-DBA</a>. </li>
<li>If you're quite small, and don't run a 24/7/365 IT support operation, make sure your upstream ISPs DO have 24/7/365 NOCs - and that they have contact details for<b> your</b> network engineering team; answer the call if it happens at 3am...! If you're doing BGP, it's serious business.</li>
</ul>
<h2>
Finally... </h2>
It's *very* hard to secure the "core" of the Internet, but it's quite easy if you start at the "edge". Secure your edge, and help to get ISPs and IXPs to secure theirs (and tell your friends to get on the same bandwagon).<br />
<br />
It's really, really helpful if your ISP has a responsive support NOC (the ISP mentioned here does, and we like that a lot; of course, you get what you pay for, and there are a <i>lot</i> more zeroes involved in monthly invoices than with a typical SOHO ISP!); you can get a lot of investigation done if they have a Looking Glass (i.e. you can probably tell them what's broken to save them time!).<br />
<br />
<h2>
More reading</h2>
<br />
<ul>
<li>Cloudflare have a pretty good overview of everything RPKI - both an overview of some key problems with BGP, the need for secure routing, and RPKI. You can find it here: <a href="https://blog.cloudflare.com/rpki/">https://blog.cloudflare.com/rpki/</a></li>
<li>Juniper's Day One guide "Deploying BGP Routing Security" covers all the essentials if you're thinking of doing Validation on this platform. <a href="https://www.juniper.net/uk/en/training/jnbooks/day-one/deploying-bgp-routing-security/">https://www.juniper.net/uk/en/training/jnbooks/day-one/deploying-bgp-routing-security/</a></li>
<li>BGP is fascinating at internet scale. The Internet Peering Playbook will open your eyes to how the global internet actually works, and the kinds of games larger ISPs will play! <a href="http://drpeering.net/HTML_IPP/ipptoc.html">http://drpeering.net/HTML_IPP/ipptoc.html</a></li>
</ul>
<br />
<h1 style="text-align: left;">On the State of Firewalls: are NGFWs (becoming) obsolete?</h1>
<i>Posted by James Stapley on 10 May 2019; last updated 13 May 2019.</i><br />
<br />
Between the last blog post and this one, I’ve moved from K-12 into Higher Education, at the first place in Sub-Saharan Africa to have Internet connectivity. This is a vastly different environment in some ways – in particular, firewalling is quite different. You’re dealing with a user population that is entirely adults. Some of those adults engage in legitimate research on things that some would consider a bad idea (malware) or “morally dubious” (porn, pop-up ads, etc.), or need unfiltered traffic (network telescopes, honeypots, big data “science DMZs”). The particular University I work at has generally had a liberal outlook with regards to personal freedoms (and concomitant responsibility) – I think that’s generally a good thing and exactly where higher education should be.<br />
<br />
We’re currently looking at doing a hardware refresh of our ~7 year old enterprise firewalls – mainly because the support on the current solutions is eye-watering. The present solution works fine (although it has quite limited capacity for logging – about 8 hours of our traffic), and it’s approaching vendor EoL status. Interestingly, even moving to a newer (and, because Moore’s law, more performant) hardware platform from the same vendor saves us money over a number of years. So we’re thinking about what we need, and that’s prompted some musings about the state of firewalls…<br />
<br />
<a name='more'></a><br />
<br />
Reducing privacy and security to be able to “better filter” traffic is an argument that does not hold a lot of water in some environments (like ours), and so SSL/TLS interception (<a href="https://en.wikipedia.org/wiki/Man-in-the-middle_attack">MITM</a>) is really a non-starter (for us).<br />
<br />
Virtually everything is moving across to SSL/TLS (if it hasn’t already done so). All those lovely application identification based NGFW firewall filter features? Virtually useless. “Content” filtering? Hah! Virus scanning? Nope. DLP? No, sorry.<br />
<br />
Unless you’re willing to MITM every single connection (perhaps with some careful exemptions for things like healthcare and personal/business finance), you’re probably done for - at least whilst encrypted traffic is passing through the firewall.<br />
<br />
On top of that, implemented in certain ways, features like certificate pinning and HSTS mean you’ll usually have to exempt various sites and services (Google Apps don’t like SSL interception, for instance), which reduces the utility of even doing this yet further. For example, it’s quite hard to allow enterprise google drive and not have someone exfil data through personal Drive (although not totally impossible if you can see inside every packet). If I were a cloud provider, I’d probably be thinking about selling dedicated IP addresses, like web hosts used to before SNI existed and you wanted to do SSL, so my enterprise customers had an easier life. Of course, doing this in a DDOS resilient way that scales across CDN/cloud edge is not trivial – at least without an on-premises middleware box or some sort of VPN or tunnel and a guarantee of where traffic can wind up within the cloud service provider). And of course, in the era of the smartphone (virtual “work” sub-systems on devices aside), you’re one 3G data session or coffee shop hotspot away from whatever you’re worried your users are going to do anyway…<br />
<br />
There was some temporary hope for some of this – inspecting the SNI field in the SSL/TLS handshake. Of course, the Internet community have (quite correctly) decided that this ability in and of itself is a privacy violation, so we now have a draft standard that encrypts SNI (<a href="https://blog.cloudflare.com/esni/">Cloudflare</a> and Firefox support it, for example – see <a href="https://datatracker.ietf.org/doc/draft-ietf-tls-esni/">ESNI</a>) – so you have no way of figuring out what that TLS encrypted packet is about without MITM decryption (or perhaps an invasive browser plugin).<br />
<br />
A move to gigantic, amorphous clouds makes whitelisting “safe” (or blacklisting “unsafe”) IPs really hard - and is of course why we used to look inside the unencrypted packets to find out exactly where that request was going to/coming from.<br />
<br />
(Warning: million dollar ideas ahead.) Without MITM, that pretty much just leaves DNS as a place you might be able to exert significant control; the way DNS works is going to make implementing ideas like ESNI hard, so it's a fairly long-term bet. I would therefore not be at all surprised if a hardware firewall vendor will soon suggest that you make your firewall(s) your clients' recursive DNS server(s). It’s not much of a jump, software wise, to turn content filtering lists of domains/URLs and pattern matches into DNS software that returns NXDOMAIN for things you don’t care to allow (or conversely, only resolves those you choose to whitelist, and otherwise forwards you to a captive portal with an error message). Those people who use DNS (like OpenDNS/Umbrella) to implement filtering are now arguably ahead of the game (for the time being). Of course, unless objectionable things are a) within a “boundable” list of IP addresses that are b) dedicated to that function, your clever users’ easiest workaround is simply to use the IP of the service they want, bypassing DNS entirely, with more advanced users hacking their hosts file. A “stateful” approach to that concept might be to include functionality like “only allow requests that pass our stateful filter ruleset, AND which have a recent (within TTL since lookup) corresponding DNS lookup, resolving that dst IP, that wasn’t NXDOMAIN from that client; OR are part of an existing, authorised connection”. Oh, and if you allow VPNs and/or client DNS traffic out from your network, well, good luck with that. Of course, once you mess with DNS, you will inevitably get some ICT researcher being understandably grumpy…<br />
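<br />
To show what that "stateful" DNS idea might look like, here is a toy sketch. Everything in it (the class, the method names, the example domains and addresses) is invented for illustration; it is not something any firewall vendor ships as far as I know, and it leaves out the ordinary ruleset check and the tracking of existing connections.<br />

```python
import time
from dataclasses import dataclass, field


@dataclass
class DnsAwareFilter:
    """Toy model: new connections need a recent, permitted DNS lookup."""
    blocked_domains: set
    # (client, destination IP) -> time at which the lookup expires
    _resolved: dict = field(default_factory=dict)

    def _is_blocked(self, name):
        # Match the name itself or any parent domain on the block list.
        labels = name.lower().rstrip(".").split(".")
        return any(".".join(labels[i:]) in self.blocked_domains
                   for i in range(len(labels)))

    def resolve(self, client, name, answers, ttl, now=None):
        """Record a lookup; an empty list stands in for NXDOMAIN."""
        now = time.monotonic() if now is None else now
        if self._is_blocked(name):
            return []
        for ip in answers:
            self._resolved[(client, ip)] = now + ttl
        return list(answers)

    def allow_new_connection(self, client, dst_ip, now=None):
        """Allow only if this client resolved this IP within the TTL."""
        now = time.monotonic() if now is None else now
        expiry = self._resolved.get((client, dst_ip))
        return expiry is not None and now <= expiry


fw = DnsAwareFilter(blocked_domains={"bad.example"})
fw.resolve("10.0.0.5", "www.good.example", ["192.0.2.10"], ttl=300, now=0)
fw.resolve("10.0.0.5", "ads.bad.example", ["192.0.2.66"], ttl=300, now=0)

print(fw.allow_new_connection("10.0.0.5", "192.0.2.10", now=100))  # True
print(fw.allow_new_connection("10.0.0.5", "192.0.2.10", now=301))  # False
print(fw.allow_new_connection("10.0.0.5", "192.0.2.66", now=100))  # False
print(fw.allow_new_connection("10.0.0.6", "192.0.2.10", now=100))  # False
```

The last line is the "use the IP directly" workaround being refused: the second client never looked the name up, so its connection is dropped.<br />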
<br />
And of course, people are breaking DNS, too... <a href="https://en.wikipedia.org/wiki/DNS_over_HTTPS">DNS over HTTPS</a> and <a href="https://en.wikipedia.org/wiki/DNS_over_TLS">DNS over TLS</a>. I don't think it's unreasonable in an Enterprise to insist your DNS servers are used and to block other services. Things on "public" and even "guest" networks are, arguably, rather different.<br />
<br />
Of course what firewall vendors all currently say is “you need to MITM your traffic; look at our expensive, shiny ASICs that accelerate that”. *facepalm*<br />
<br />
So, in the absence of magical DNS hacks, what do modern networks that can’t or won’t implement MITM need? Well, it’s back to the 1990s or early 2000s. Stateful firewalls, with vendor maintained lists of naughty (and perhaps "nice") IP addresses. This suggests that, at least for any organisation that’s not willing to do MITM, you need a more modest set of boxes (or that more modest firewalls can presumably handle way more traffic at close to multigigabit line speeds), and a much more modest service subscription – basically, they should supply a list of IPs that allow you to drop connections to/from IPs hosting e.g. malware C&C, and any IPs that are just “naughty” for other reasons (within your existing threat/content categories). You’re back to matching on tuples of protocol, src and dst IP, and src and dst ports, with some address lists, and use of connection state (new/established/related/invalid). If some really good open source/crowdsourced shared resources of data for address lists appear, it’s going to make justifying spending big bucks on traditional enterprise firewall vendors hard work (particularly if you’re FUD resistant).<br />
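<br />
For anyone who has only ever known NGFWs, this is roughly what "matching on tuples plus address lists plus connection state" boils down to. It is a deliberately simplified sketch: the <code>Packet</code> class, the <code>verdict</code> function and the example addresses are all hypothetical, and a real firewall does this in the kernel or in hardware, not in Python.<br />

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Packet:
    proto: str   # "tcp" or "udp"
    src: str
    dst: str
    sport: int
    dport: int
    state: str   # "new", "established", "related" or "invalid"


def verdict(pkt, blocklist, allowed_services):
    """Return 'accept' or 'drop' for one packet."""
    if pkt.state == "invalid":
        return "drop"
    # Vendor or community supplied list of "naughty" networks.
    bad_nets = [ip_network(n) for n in blocklist]
    for addr in (pkt.src, pkt.dst):
        if any(ip_address(addr) in net for net in bad_nets):
            return "drop"
    # Traffic belonging to an already authorised connection passes.
    if pkt.state in ("established", "related"):
        return "accept"
    # New connections must match an allowed (protocol, destination port).
    if pkt.state == "new" and (pkt.proto, pkt.dport) in allowed_services:
        return "accept"
    return "drop"


blocklist = ["203.0.113.0/24"]   # e.g. known malware C&C
allowed = {("tcp", 443)}

print(verdict(Packet("tcp", "10.0.0.5", "198.51.100.7", 40000, 443, "new"), blocklist, allowed))         # accept
print(verdict(Packet("tcp", "10.0.0.5", "203.0.113.9", 40000, 443, "established"), blocklist, allowed))  # drop
print(verdict(Packet("tcp", "10.0.0.5", "198.51.100.7", 40000, 23, "new"), blocklist, allowed))          # drop
```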
<br />
For those of you that have to be able to content filter (because <i>think of the children</i>!) or do e.g. DLP for compliance reasons, you’re almost certainly going to have to MITM traffic or do some really draconian “whitelist only” filtering; the latter has copious downsides to it, of course, but arguably remains the only realistic current alternative to MITM at the network “border”.<br />
<br />
So, if you are implementing MITM, NGFWs will continue to work quite well for you.<br />
<br />
If you’re not, well, their days are very much numbered.<br />
<br />
This (perhaps simplistically) suggests that enterprise networks need to behave or be treated more like transit ISPs – they carry your packets regardless of what they are - unless they’re demonstrably “really bad”, in which case, you’ll find them blackholed in some way - and you don’t trust them. All other filtering and security rests in the hands of your apps, which requires a shift in thinking within the enterprise about how security is achieved. NAC needs to keep bad actors out of your LAN, or at least mitigate their threats; identifying users and devices within your network is increasingly important. The network edge is inside your application now.<br />
<br />
For the necessarily paranoid, going back to running things as if you still used enterprise mainframes with dumb terminals (and strip searching people for cameras) might be required and somewhat effective (think SCIF). Right up until you encounter an in-house adversary with an eidetic memory, or your enterprise app can be run on something that can take screenshots… Of course, none of that is realistic in a “modern” enterprise, and most don’t need quite that level of paranoia.<br />
<br />
Defence-in-Depth, “no perimeter” or “borderless” modes of thinking and design are increasingly imperative, and the challenges of BYOD multiply. Not all of those challenges are technical – many of them are policy, enforcement and training related.<br />
<br />
Fun times.<br />
<div>
<br /></div>
<br />
<h1 style="text-align: left;">On PABXs...</h1>
<i>Posted by James Stapley on 4 June 2018; last updated 7 June 2018.</i><br />
<br />
We eventually decided that our existing PABX solution no longer met requirements. In particular, it was very opaque, with a licensing model (and vendor locked-in handset requirements) that was punishingly expensive.<br />
<br />
We considered "DIY" with something from Yeastar (most likely an <a href="https://www.yeastar.com/s-series-voip-pbx/">S300</a>), with an interim set of BRI and POTS links to move away from Avaya onto another platform, before investigating SIP trunks for the "uplink", but we looked at our work schedules (and the offers that came in) and we ultimately decided to get a VOIP telecommunications company to help us install it, implementing SIP from launch.<br />
<br />
Here are a few things we learned along the way...<br />
<br />
<a name='more'></a>We considered a number of options, including carrying on the way we were. However, the existing system (an Avaya IP Office 500 system) is/was extraordinarily expensive, and the SIP trunk offerings from some of the suppliers were quite good.<br />
<br />
It turned out that VOIP call rates (and line rentals) are <i>so</i> competitive (compared to incumbent call rates and BRI line rentals) that we could pay off the change in system in a few years, which certainly made persuading the finance part of the school easier.<br />
<br />
One of the potential vendors, who had supported the partly broken Avaya system for several years, immediately started dishing out <a href="https://en.wikipedia.org/wiki/Fear,_uncertainty_and_doubt">FUD</a>, at which point we lost much interest in carrying on any further relationship with them.<br />
<br />
It was something of a challenge to find someone who wanted to give us SIP trunks on our own bandwidth - many insist on laying fibre to the premises - but then you point out where you live, and they look confused, as they never have fibre to this town. Eventually, we found a fairly major national company willing to meet our requirements.<br />
<br />
We ultimately opted for a system (<a href="http://www.farsouthnet.com/products/com-x-range/com-10x/">Com.X10</a>) from <a href="http://www.farsouthnet.com/">Far South Networks</a>, a local company that basically "puts Asterisk in a box", along with a few different Yealink handsets (<a href="https://www.yealink.com/products_24.html">T23G</a> as a basic model, <a href="https://www.yealink.com/products_25.html">T27G</a> for power users, <a href="https://www.yealink.com/products_27.html">T29G</a> for switchboard with some <a href="http://www.yealink.co.za/products/others/accessories/expansion-module-exp20">EXP20 modules</a>, <a href="https://www.yealink.com/products_81.html">W52P</a> for people needing mobile handsets and <a href="https://www.yealink.com/products_35.html">CP860</a> as a conference phone), implemented, supplied and supported by a major national VOIP telephony provider.<br />
We also got an <a href="http://www.farsouthnet.com/products/ita/comma-ita-intelligent-telephony-adapter/">iTA</a> with the requisite cards, for "analogue" lines, but we're not using it in practice (it was a backup option in case the SIP trunk was an absolute failure, which it isn't).<br />
The Yealink Handsets almost all support gigabit pass-through, which is useful in 2018 in an environment where most areas still tend to have a network point ratio hovering around about "how could we <i>ever </i>need more than one jack per room?"...<br />
<br />
We liked that the PABX is pretty platform agnostic - it's "just SIP" - meaning we're not locked into any particular vendor from a handset point of view (and can theoretically investigate "softphones" and apps that do SIP).<br />
<br />
We were looking forward to having a 3rd party do "most of the work" for us, which is more or less the case. Of course, there are still some things we need to "tidy up"...<br />
<br />
<h2>
Comparisons between Far South and Avaya</h2>
Obviously, the "big name" VOIP platform has a few advantages - and one very considerable disadvantage (cost).<br />
<br />
We most missed the fairly easy centralised provisioning of handsets.<br />
<br />
The IP Office Manager software provides a much "richer" set of programming/provisioning options for the handsets. One can use autoprovisioning on the Far South, but it's much more involved, and requires editing config files.<br />
We were not given admin access onto the PABX server, so we haven't experimented, but <a href="http://forum.yealink.com/forum/showthread.php?tid=1555">Yealink's configuration generator tool</a> may be useful (although it looks like it doesn't quite support the models we have "out of the box" - that may be fairly easy to fix by copying the relevant handset specific XML config files somewhere). It's available from the support page of each handset, so you would expect it to be more compatible. Autoprovisioning also comes in very handy when it becomes necessary to update firmware. Theoretically, it's possible to run a 3rd party autoprovisioning server, but when <a href="https://sites.google.com/site/comxwiki/adding-a-sip-phone-as-a-managed-sip-phone">one exists on the PABX</a> and there's a way of doing the <a href="https://sites.google.com/site/comxwiki/configuring-blf-keys-on-managed-sip-phones">BLFs</a>, it's silly to do so. It surprised me that a major VOIP telecomms provider doesn't seem to use this modality, because it's the most scalable, sensible way of managing fleets of SIP endpoints; that may be a function of being serviced by a fairly minor branch, but I don't know. The unique handset identifier, not surprisingly, is its MAC address. Discussions with a friend who recently implemented a similar system at another site (same supplier, too) suggest there were some bugs/incompatibilities between certain Yealink handsets and Far South's autoprovisioning system - but I imagine those ought to be fixed by now.<br />
<br />
At the moment, we're stuck with connecting to the webserver on each handset to configure things from there (which can include just uploading a config file that is "correct" already - if you have a lot of the same model(s) of phones, get one right, download the config file, edit the SIP account details and any other things that change between handsets, and upload away).<br />
<br />
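The "get one right, then edit and upload" routine above lends itself to a little scripting. What follows is only a rough sketch in Python: the config key names in the template are placeholders I've made up for illustration (check them against a config file downloaded from one of your own handsets), and naming the output files after the handset's MAC address is likewise an assumption.<br />
<br />

```python
from string import Template

# Hypothetical template: these key names are illustrative placeholders and
# are not guaranteed to match any particular handset model or firmware.
TEMPLATE = Template(
    "account.1.label = $label\n"
    "account.1.user_name = $extension\n"
    "account.1.password = $password\n"
    "account.1.sip_server_host = $pabx\n"
)


def normalise_mac(mac: str) -> str:
    """Return a MAC address as 12 lowercase hex digits with no separators."""
    digits = "".join(c for c in mac.lower() if c in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def render_configs(handsets, pabx):
    """Map an assumed '<mac>.cfg' file name to the rendered config text."""
    configs = {}
    for handset in handsets:
        name = normalise_mac(handset["mac"]) + ".cfg"
        configs[name] = TEMPLATE.substitute(
            label=handset["label"],
            extension=handset["extension"],
            password=handset["password"],
            pabx=pabx,
        )
    return configs
```

<br />
Feed it a list of dictionaries (one per handset, perhaps read from a spreadsheet export) and write each returned value out to a file for upload. The only real point is that the per-handset differences live in one table rather than in your head.<br />
<br />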
The other surprising "no brainer" feature that seems to be absent is an autogenerated centralised address book. With the Yealink phones, you have a couple of options (LDAP or XML files are the two front-runners for centralised provisioning) to access phone books, other than the "on phone" manually maintained option (not usually a good idea). I can't see an LDAP or XML address book I can just connect to on the Far South that basically lists the names and numbers as configured. One could, if you cracked the autoprovisioning, probably push local address books in the config files, but that is not particularly responsive to change.<br />
<br />
Fortunately, there's a 3rd party application that is fairly easy to get running that creates an easy to use web-based GUI that generates the required XML format. Octivi Labs have kindly created the open source <a href="http://labs.octivi.com/yealink-phone-book-manager-released/">Yealink Phone Book Manager</a>. Still, this means you're stuck maintaining a 3rd party system just for a working address book - and means you need to remember to update this when you change anything on the PABX. With Avaya's IPO, the address book functionality is built in, and "just works" on compatible handsets, and tracks changes to the numbering plan (and can of course also include important 3rd party numbers).<br />
<br />
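If you'd rather not run a separate web application, an XML phone book can also be generated by a small script from whatever list of names and extensions you already keep. This is a minimal sketch only: the element names (YealinkIPPhoneDirectory, DirectoryEntry, Name, Telephone) are my assumption of the remote phone book layout, so compare the output with a file known to work on your handsets before relying on it.<br />
<br />

```python
import xml.etree.ElementTree as ET


def build_phonebook(entries):
    """Build a phone book XML string from a {name: number} mapping.

    The element names are assumed, not taken from vendor documentation.
    """
    root = ET.Element("YealinkIPPhoneDirectory")
    for name, number in sorted(entries.items()):
        entry = ET.SubElement(root, "DirectoryEntry")
        ET.SubElement(entry, "Name").text = name
        ET.SubElement(entry, "Telephone").text = number
    return ET.tostring(root, encoding="unicode")
```

<br />
Publish the resulting file on an internal web server and point the handsets at it. You still have to regenerate it whenever the PABX numbering changes, so it does not remove the maintenance burden described above.<br />
<br />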
<h2>
Training</h2>
Reasonably, they like people to <a href="http://www.farsouthnet.com/training/">be trained</a> on the system before letting them loose on it. However, from what I've seen of the system, the admin web GUI is very straightforward, and the backend is "just Ubuntu with some add-ons", which is not at all scary.<br />
<br />
<h2>
Routing</h2>
Another amusing thing was watching their routing announcements in action. We utilise bandwidth from an <a href="https://en.wikipedia.org/wiki/National_research_and_education_network">NREN</a> - a national consortium of educational institutions (primarily Universities and national research organisations) that pool resources to access better Internet facilities than might otherwise be the case. In this instance, it means they run/fund their own ISP, which peers around the country at multiple IXP facilities.<br />
The first peering point our traffic passes was in Durban, and the problems were all on the final hop (i.e. the SIP server itself on their end).<br />
The connection quality to their "main" SIP trunk endpoint in Johannesburg was terrible - latency ranging from 35 to 1250-odd milliseconds, with correspondingly horrific jitter and ~30% packet loss. Changing to the Durban endpoint was modestly better, but I see they're now using their Cape Town SIP trunk server. Of course, the traffic still dumps out in Durban and is carried on their backhaul from there. Our ISP's network is, of course, very good (Universities make very grumpy customers). <i><a href="https://en.wikipedia.org/wiki/MTR_(software)">mtr</a></i> is a wonderful tool - far better than anything available in Windows. It's quite fun to expose <a href="http://schoolsysadmin.blogspot.com/2018/02/mpls-causes-some-weird-effects-aka-why.html">MPLS tunnels</a>, where the service provider you are routing over on that hop makes this information available.<br />
<br />
You may wish to give your provider some routing information: they may need your IP address range(s), AS numbers, and so on, so they can adjust their routing. It may help if you understand your own typical routes to various places, or can supply "typical" routes (it will help if they give <i>you</i> a list of SIP endpoints you can traceroute to, and supply average RTT and jitter for).<br />
<br />
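If you're collecting round-trip times to hand over to a provider, it helps to boil them down to the three numbers that matter for voice. The sketch below assumes you already have a list of RTT samples in milliseconds, with None standing in for a lost probe. Jitter here is simply the mean absolute difference between consecutive replies, which is a simplification of my own choosing rather than the smoothed estimate RTP stacks use.<br />
<br />

```python
from statistics import mean


def summarise(rtts_ms):
    """Summarise probe results; rtts_ms holds RTTs in ms, None marks a loss."""
    if not rtts_ms:
        raise ValueError("no samples supplied")
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    # Simplified jitter: mean absolute change between consecutive replies.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    return {
        "avg_rtt_ms": mean(received) if received else None,
        "jitter_ms": mean(diffs) if diffs else 0.0,
        "loss_pct": loss_pct,
    }
```

<br />
Run it per SIP endpoint, at a few different times of day, and you have something concrete to send along with your traceroutes.<br />
<br />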
<h2>
Lessons Learned</h2>
<div>
We've learnt some things that might be worth bearing in mind. Many of them are quite obvious, but they bear repeating/stating. </div>
<br />
<h3>
1. No matter how much stakeholder engagement you think you've done, do more.</h3>
<span style="font-size: small; font-weight: normal;"><br /></span>
<span style="font-size: small; font-weight: normal;">It's amazing how much time and effort you need to put into "stakeholder engagement" - don't assume anyone else is going to do this for you. Make sure you understand the "missing/broken/desired" feature-sets, all the extensions on your campus, etc. Go around and "interview" whomever you consider (or your boss considers) key "stakeholders" in this regard (bare minimum is all people who direct calls for others, including PAs/secretaries). Do this BEFORE you issue Requests for Proposal, and certainly before you put in the finalised order. It's surprising how many things keep crawling out of the woodwork (even from people you've engaged with a lot). Declare a change freeze for a period once the order goes in. If someone forgot something, tough (make sure it's not you)! </span><br />
<span style="font-size: small; font-weight: normal;"><br />Some of this stakeholder work is likely to be "Politics"; see point 6, below. </span><br />
<span style="font-size: small; font-weight: normal;"><br /></span>
<br />
<h3>
1.b. Likewise, your own prep work!</h3>
<h3>
<span style="font-size: small; font-weight: normal;">Similar to that, no matter how much preparation you've done, do more. </span><br />
<div style="font-size: medium; font-weight: 400;">
<br /></div>
<div style="font-size: medium; font-weight: 400;">
It's almost worth visiting each and every extension to verify what is there, because there are sometimes surprises (like handsets you thought were one thing when they're actually something else, or extensions that turn out to have been manually "split" over the years, sometimes between buildings!). </div>
<div style="font-size: medium; font-weight: 400;">
<br /></div>
<div style="font-size: medium; font-weight: 400;">
Make sure you have up-to-date network diagrams (physical layout, including interconnections), and can log into all managed devices. Pre-configure as much as you can (ideally during a change window). You may or may not need/want more advanced things like various LLDP features - depends on your switch support, and the VOIP handset vendor (and the implementer's preference). Find out if any changes are needed to the DHCP server(s) active on the VOIP VLAN(s) (like DHCP option 66 or others). </div>
<div style="font-size: medium; font-weight: 400;">
<br />
Identify (and eliminate) unmanaged/non-PoE switches. PoE switching is the "right" way of doing IP telephony (as are dedicated VLANs, for which you need a managed switch). PoE is perhaps "negotiable"; management of switches really isn't, in an enterprise network of any size/complexity. </div>
<div style="font-size: medium; font-weight: 400;">
<br /></div>
<div style="font-size: medium; font-weight: 400;">
PoE on switches, however, allows you to at least try to ensure phones remain up "for a while" during power outages; people are used to phones being on when the power isn't - where this is NOT the case, ensure your stakeholders understand the limitation, and what to do otherwise. Centralised power means easy centralised power backup - i.e. put a UPS on each PoE switch/stack powering phones (and WiFi APs) and they remain up - for a while at least. Certainly, if you have a standby generator, ensure the UPS runtime is long enough to cover the delay in switchover from utility to backup power. </div>
<div style="font-size: medium; font-weight: 400;">
<br /></div>
</h3>
<h3>
2. Running two PABXs in parallel may -or may not- be a good idea. </h3>
<span style="font-size: small;"><span style="font-weight: normal;"><br /></span></span>
<span style="font-size: small;"><span style="font-weight: normal;">If you're changing technology (i.e. going from analogue or ISDN trunk lines to SIP or other VOIP backhaul) you may have the option to run both systems partially in parallel. Depending on your environment, this may be quite attractive, in that you can implement a new system in parallel, test it, and only cut across when you're happy with it. There may be complications if you're partly replacing a hybrid PABX with some VOIP handsets, but you should be able to work around this with managed switches and perhaps some external phone power supplies and/or PoE injectors. IP address space limitations are easily dealt with through larger scopes, or additional voice VLAN(s). </span></span><br />
<span style="font-size: small;"><span style="font-weight: normal;"><br /></span></span>
<span style="font-size: small;"><span style="font-weight: normal;">Obviously, this means you have more systems floating around, and perhaps, more confusion - and probably means more than one visit to each end point. However, this multiple visit "bug" is actually a feature, because it means, once you've officially switched over, you (or other helpdesk people) can go around and make sure that all the "essential" features on the handsets are understood, and report/fix any outstanding issues. Many people are somehow bad at reading/following instructions, and "monkey see, monkey do" can be very helpful. Prepare to deflect any non-VOIP related issues that may crop up on such contacts - get them to fill in a helpdesk ticket for non-VOIP issues, that will be handled by others. </span></span><br />
<span style="font-size: small;"><span style="font-weight: normal;"><br />Make certain that at least "essential" people have both systems as soon as possible - and that "essential" endpoints are the last to have the old system removed. This is your switchboard, and any "emergency/vital" contact points (which may include IT, maintenance, healthcare, marketing/admissions/fees, other departmental switchboards, and so on). </span></span><br />
<span style="font-size: small;"><span style="font-weight: normal;"><br /></span></span>
<br />
<h3>
3. Don't assume that the installers know as much as you do about what is needed, even if you've spent ages filling out (their!) needs requirements documents for them. </h3>
<div>
<br /></div>
Make sure you assign people (or yourself, stretched really thin) to stay with the person(s) doing the central configuration - as well as to help them find all the endpoints for installation - you may need to co-opt "helpers" from the staff or student body. Make sure every on site implementation person has a reasonably "clueful" companion from your school/team.<br />
<br />
Sometimes the post-sales spec team(s), and the actual implementation team(s) aren't even the same people.<br />
<br />
If their documentation doesn't make sense to you, make your own, and that should be the "gold standard" - if you don't know what a feature is called, ask.<br />
<br />
Don't assume any feature is available just because you had it before, or have seen it elsewhere - make sure it's part of your spec documentation, and they say that it is supported (in writing) in the final order/contract.<br />
<br />
Onboard sufficient VOIP/telephony jargon that you can properly describe what you want/need. For example, if you don't know what FXO and FXS are, and you're implementing analogue stuff, learn it! If you don't know what IVR is, learn - and so on. Basically, go through spec sheets, and Google unfamiliar terms and acronyms.<br />
<br />
<h3>
4. Don't assume the install team has actually been given/read commissioning/order documentation.</h3>
<div>
It's surprising how people pitch up and ask questions you've already documented at length. Don't get annoyed, but do have clear guidelines to give them, and answers to likely questions - or have a clear understanding of where to get the definitive answer. </div>
<div>
<br /></div>
<div>
We were quite surprised how much time the implementation person had to spend on the phone to various "higher tier" support people to get things working. Eventually, it seems to have transpired that the SIP account/lines were not even correctly created by the central infrastructure team. Re-creation seemed to solve a multitude of sins (like not being able to dial out...!) </div>
<div>
<br /></div>
<h3>
5. Clear your schedules before <i>and after</i> you do this sort of thing. </h3>
<div>
In order to do steps 1, 8, and 10 properly, you need (lots of) time. We all know how busy IT helpdesks at schools can be. </div>
<div>
<br /></div>
<div>
Make sure sysadmin chaos is eliminated through change freezes (and avoiding "patch Tuesdays"). Declare (management approved) reduced support availability, and what will happen instead, and perhaps even what problems people will have to "just deal with" whilst you/IT team get VOIP sorted. You may have the luxury of a sufficiently large IT support operation that you can run a "skeleton" IT helpdesk for non-VOIP issues. This is unlikely at schools! </div>
<div>
<br /></div>
<div>
There will, no matter how good the implementation was, be outstanding things that need attention. Make sure you "block out" at least two full days each side of the implementation window in your calendar. A week each side is probably more realistic.</div>
<div>
<br /></div>
<div>
Ensure the time to attend to them - and the documentation, whilst things are "fresh" in your mind - is available. </div>
<div>
<br /></div>
<br />
<h3>
6. Don't make Politics your problem. </h3>
<div>
<br /></div>
<div>
It's amazing how much "politics" there can be between administrators whose literal job it is to answer phones. (I understand this; phones are ludicrously intrusive on all other types of work. I <i>hate</i> phones - asynchronous communication [like email, text chat, etc.] is MUCH better). Still, your organisation's external customers and stakeholders (parents, suppliers) expect to speak to a human that can actually help them quite quickly if they phone you. </div>
<div>
<br /></div>
<div>
Don't get involved. </div>
<div>
<br /></div>
<div>
Don't play favourites. </div>
<div>
<br /></div>
<div>
The correct point of view on how the system should function is that of the (external) customer - what do (prospective) parents reasonably expect? Deliver that experience as much as possible. Explain this to management. </div>
<div>
<br /></div>
<div>
Get senior management to understand what the "best practice" is, and, diplomatically(!), hint that their PA may not be giving them the best answer as to how things "ought" to be. Make sure you present this best practice "out of band" of any meeting/email where people that deliver Politics unto the system (most commonly, the PAs of senior management!) are present/recipients. Remember that many may give their PAs access to email, so that may not be the best modality to bring things to their attention...!</div>
<div>
<br /></div>
<div>
I don't recommend it (because it's often an "untruth", which makes me uncomfortable), but a new system is often an opportunity to implement the "right" way as "the only <i>supported</i> way"/"way it <i>has</i> to be now, because of system limitations/features"...! This also gives management an opportunity to "get out of" difficult politics with their direct reports... An inanimate object can take quite a lot of hate and just not care, particularly if irate users cannot throw it out of the window! </div>
<div>
<br />
If there are unpopular "ways it has to be", point out some awesome new feature they're getting as a (partial?) recompense! Commiserate about the horrors of technology. Move on with the day! :) </div>
<div>
<br /></div>
<div>
<br /></div>
<h3>
7. Number portability is awesome. </h3>
<div>
<br /></div>
<div>
There was once a time when a change in telephony provider meant all your numbers had to change. </div>
<div>
<br /></div>
<div>
No longer! </div>
<div>
<br /></div>
<div>
This feature is extremely useful, because it means that external stakeholders don't "lose" your contact information, and all your printed stationery, etc. remains substantively correct. Changing phone numbers is <i>at least</i> as much expense and trouble as changing your organisational branding. </div>
<div>
<br /></div>
<div>
If you live somewhere benighted that *doesn't* yet have number portability, make sure you lend your voice to any campaign to enact it through the appropriate governmental/regulatory processes. </div>
<div>
<br /></div>
<h3>
8. Document as you go</h3>
<div>
You will likely learn things as you go along. Make a note of them (in your hardback notebook) and transfer them into your team Wiki once you have a chance. Refer to your checklist(s) as implementation proceeds. </div>
<div>
<br /></div>
<div>
There are additional things you should document as policy - does a person, or a functional role, take precedence for Moves, Adds and Changes? Who can request a phone (or a fancier phone, or an additional phone)? Schools are odd as organisations (from a telephony perspective, as well as others!), because not everyone has a supplied "office phone" - they're pretty annoying in a classroom setting, and that is the "average" teacher's office environment. </div>
<div>
<br /></div>
<div>
I would say functional role should take precedence - because from an organisational perspective, people phoning number X for official business probably mostly want Role Q, not Person P. If a person who used extension X wears multiple job role "hats", this may become somewhat complex. Documentation (i.e. contact details on websites or other forms of directory) should be kept up to date, if you supply such information publicly (or privately!). This is similar to the "role based" vs "person based" email account issue, but abstraction is harder/more expensive (but not impossible) with telephony. Indeed, when there are clearly different functional roles, it may be worth "pre-allocating" external numbers or at least extensions to those separate roles, and having multiple lines on a telephone to support this. </div>
<div>
<br /></div>
<h3>
9. Make sure your own implementation team knows what is required. </h3>
<div>
Make 100% sure your "helpers" understand what "in parallel" means if you're going to run both systems together during the switch-over. Language barriers are real, and sometimes, people don't pay attention in meetings (or even read documentation) when you explain what is required. </div>
<div>
<br /></div>
<div>
Have separate "your team" and then "everyone" meetings - i.e. make sure your in-house people "get" what you're planning, and that the external implementation team is also on the same page. Provide a clear "command chain", and (somehow) find some time to make sure the distributed teams are getting it right - having a centralised deployment location with all the new gear near to you may help (but not in the same room, as that is distracting when you're doing "hard" things).</div>
<div>
<br /></div>
<div>
Many of our problems came back to having insufficient time to get this implementation planning done and communicated, and conflicting commitments as the supplier kept delaying the implementation date (on one occasion because of damage in transit of a part). </div>
<div>
<br /></div>
<div>
If you can, plan extra <i>faaaaaaar</i> in advance, so you can do changes like this in a vacation/holiday period (making sure you have access to ALL keys/buildings...!). In our case, installation sooner rather than later was desired to realise cost savings - otherwise, this would have been delayed until August - although we've been planning this move for well over 5 months already; the chosen supplier could not deliver during our last holiday period (to be fair, we only finalised this project just before that holiday began). </div>
<div>
<br /></div>
<h3>
10. Make checklists. Follow them!</h3>
<div>
Checklists are a fairly good way of making sure things go properly, and that you don't forget anything important. So make them, and use them! Every single step (or group of steps) should be on a checklist, and as they are (successfully and completely) finished, they ought to be checked off. Where they are not successfully/completely finished, "red flag" them for attention/resolution/escalation. </div>
<div>
<br /></div>
<h2>
SIP Security</h2>
<div>
<br /></div>
<div>
As you can imagine, in a world where phone calls cost money (and money making premium rate lines are a thing a dubious actor might run, and direct fake calls at), SIP trunk accounts (and SIP servers/PABXs) are a tempting target; you need to <a href="https://www.networkworld.com/article/2311252/tech-primers/secure-sip-protects-voip-traffic.html">secure SIP</a>. You may want to ensure your service provider implements this, and that anyone using SIP to your PABX externally is making use of a suitable protocol (a VPN back to your home base is a good bet). A clear indicator is that the SIP URI is "sips" rather than "sip" - just like https is secure http. As with HTTP, TLS is a relatively easy (PKI complexities aside) mechanism to get this right. Even early <a href="https://tools.ietf.org/html/rfc3261">SIP RFC3261</a> covers secure use.<br />
<br />
You may need to specifically ensure that the liability for malicious calls and compromises is contractually agreed. </div>
<div>
<br /></div>
<div>
Interestingly, our SIP trunk account/PABX (the latter is more likely) got hacked/compromised and exploited within days of being implemented, and racked up thousands in billable illicit calls overnight (to the point that some of our staff got phoned in the middle of the night to ask if that was "normal")...<br />
At least some of the traffic is related to this: <a href="https://badpackets.net/ongoing-large-scale-sip-attack-campaign-coming-from-online-sas-as12876/">https://badpackets.net/ongoing-large-scale-sip-attack-campaign-coming-from-online-sas-as12876/</a></div>
<div>
<br /></div>
<div>
This irritates me, as we were explicitly instructed by the suppliers <i>not</i> to firewall the PABX at our border. They say they've implemented a firewall on the PABX host, but I didn't see much evidence of that; perhaps they've tightened up some Asterisk/FreePBX setting(s). (They have now implemented some incoming rules, but they need to be blocking <i>outgoing</i> traffic, too, as I can see traffic originating at the PABX to weird locations...). I really suspect the PABX box itself has been <a href="https://www.urbandictionary.com/define.php?term=pwned">pwned</a> and needs to be <a href="https://www.urbandictionary.com/define.php?term=nuke+n+pave">nuke 'n paved</a>. </div>
<div>
<br /></div>
<div>
I strongly suspect the service provider also does not encrypt their SIP sessions - and even though we reserved a routable IP address for all of our legitimate SIP traffic, this perhaps was not used to limit SIP trunk access. With your SIP traffic going across the Internet, there are plenty of places someone could intercept that traffic (and plaintext credentials) - and of course there may be pwned endpoints and devices all over the place. As I can see dubious looking calls in the PABX's CDRs (Call Data Records), it seems the PABX was used in the compromise rather than the SIP trunk itself being compromised (also, I can see weird traffic in our border firewall logs). </div>
<div>
<br /></div>
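<div>
Trawling CDRs by eye gets old quickly, so a small script to flag the obviously odd calls may be worth having. The following is a sketch under assumptions: it expects a CSV export with the source, destination, start time and billed seconds in the column positions of what I believe is Asterisk's default Master.csv layout, it treats "00" as the international dialling prefix, and it regards anything outside 06:00-20:00 as out of hours. All of those are guesses you should adjust for your own PABX and site.</div>

```python
import csv
from datetime import datetime

# Assumed column positions (believed to match Asterisk's default Master.csv);
# verify against your own export before trusting the results.
SRC, DST, START, BILLSEC = 1, 2, 9, 13


def suspicious_calls(rows, intl_prefix="00", day_start=6, day_end=20):
    """Yield (start, src, dst, billsec, reasons) for calls worth a second look."""
    for row in rows:
        start = datetime.strptime(row[START], "%Y-%m-%d %H:%M:%S")
        reasons = []
        if row[DST].startswith(intl_prefix):
            reasons.append("international")
        if not day_start <= start.hour < day_end:
            reasons.append("out-of-hours")
        if reasons:
            yield start, row[SRC], row[DST], int(row[BILLSEC]), reasons


def scan_file(path):
    """Read a CDR CSV file and return every flagged call as a list."""
    with open(path, newline="") as handle:
        return list(suspicious_calls(csv.reader(handle)))
```

<div>
A call flagged for both reasons at 2am is exactly the sort of thing that should be generating an alert, rather than a phone call from the provider the next morning.</div>
<div>
<br /></div>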
<div>
You may also need to check that voicemail and IVR trees and other such features don't allow "dial-through" to "other locations". I was quite surprised to find the IVR tree allowed me to dial internal extensions, for instance (an undocumented "feature"). </div>
<div>
<br /></div>
<div>
"Defense in Depth" would suggest you might want to be making use of secure SIP across your LAN as well - dedicated VLAN or not.<br />
<br />
Asterisk security is a fairly big topic, e.g. <a href="https://www.voip-info.org/asterisk-security">https://www.voip-info.org/asterisk-security</a></div>
<div>
<br /></div>
<div>
You may also want to check that your firewall doesn't blow a hole in SIP security. SIP ALGs and "session helpers" can be quite poorly implemented. Some versions of FortiOS <a href="https://forum.fortinet.com/tm.aspx?m=143389">seem to have issues</a>. A fairly in-depth guide to SIP in FortiOS 5.6 can be found <a href="https://docs.fortinet.com/uploaded/files/3611/fortigate-sip-56.pdf">here</a>. </div>
<div>
<br /></div>
<div>
It's important that credentials are sent securely, both when remote sysadmin teams send credentials, and when phones and technicians are connecting to PABXs... </div>
<div>
<br /></div>
<h2>
Looming Worries</h2>
The PABX itself is some sort of 1U device running Ubuntu - 12.x - which is of course now <a href="http://releases.ubuntu.com/12.04/">out of support</a> - even in LTS edition. This is something that you need to be wary about with embedded devices. It may be that there is an updated version coming out RealSoonNow with a new Ubuntu distribution, but we'd expect them to ship with the current available version - or be updated to it by the manufacturer. They (Far South) have their own Ubuntu repository (update.commanet.co.za), so they *may* be manually maintaining things that need patching. Of course, as/when they get onto the next Ubuntu release (presumably the latest LTS), upgrading looks <a href="https://sites.google.com/site/comxwiki/upgrading-a-com-x-or-wanderbox-device/upgrading-your-device-over-the-internet">fairly straightforward</a>, if likely to cause some downtime.<br />
<br />
Given that these things, by their nature, need to at least partially live "on the Internet", it is a worry to have potentially unpatched systems online - particularly as the supplied configs don't seem particularly hardened. We'll probably lock down our firewall more as and when we get a chance to figure out what it needs beyond the basic SIP ports. It seems to run shorewall, so that could also be tightened up on the host itself - either there or in iptables.James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com0tag:blogger.com,1999:blog-5949954951402579361.post-28811080687192030672018-04-26T12:12:00.000+01:002018-04-26T14:06:09.596+01:00GSuite mail gateway using Ubuntu and PostfixWhilst a lot of vendors will tell you they "support Gmail", it turns out the level of support can be... iffy.<br />
<br />
In the end, it is often easiest if you create an email "gateway" that leverages the likelihood of just about everything on your campus being able to throw SMTP at port 25 (even though that's deprecated for mail submission), alongside GSuite for Education's SMTP email relay functionality.<br />
<br />
This is particularly important once you start getting serious about email security, at which point email from your domain MUST flow through particular email servers (because of <a href="http://schoolsysadmin.blogspot.co.za/2017/06/outgoing-email-security-in-2017-spf.html">SPF, DKIM & DMARC</a>).<br />
<br />
Read on for more!<br />
<a name='more'></a>I'm going to assume you're reasonably familiar with Ubuntu - Google any steps that seem confusing, as the Ubuntu community is large, vocal and well supported - your issue <i>will </i>have been seen before and fixed somewhere!<br />
<ol>
<li><b>Install Ubuntu</b></li>
<ol>
<li>Grab your preferred version of <a href="https://www.ubuntu.com/download/server">Ubuntu Server</a>; an LTS release is a good bet for a long-running service. Install it on a physical or virtual host. </li>
<li>In the installer, run through the scripted install. For the most part, accept defaults. When it asks what to install, only select OpenSSH server - everything else, you'll add later.</li>
<li>The server will grab an IP from DHCP - make sure that IP is allowed out through your firewall so it can successfully grab any updates/packages, etc (you can change to a static IP later if you want to). It's typically a good idea to make a reservation for the IP you intend to have that machine run on (but getting the MAC address of the server's NIC before it boots can be a challenge; a LiveCD can help bridge that gap). </li>
</ol>
<li><b>Configure Ubuntu</b></li>
<ol>
<li>Once install is finished, log into the machine with the credentials you created during install.</li>
<li>Change any network settings you want to in /etc/network/interfaces - many people like to configure a static IP. Your other option is to get the MAC address of that machine (<i>ifconfig</i> eth0 HWaddr - unless of course <a href="http://schoolsysadmin.blogspot.co.za/2017/09/goodbye-eth0-hello-enp4s0.html">eth0 has gone away</a>) and put it into DHCP to hand out the "correct" IP every time. In a highly virtualised environment, most things should be configured by DHCP (other than some core routers, and possibly your DHCP servers). </li>
<li>Configure the most local mirror(s) you know about in /etc/apt/sources.list ; <a href="https://launchpad.net/ubuntu/+archivemirrors">this list</a> is not necessarily comprehensive. (Google if you're not sure how)</li>
<li>I long ago picked up the habit of using the Shorewall firewall instead of the built-in one. Install Shorewall (<i>apt-get install shorewall</i>). </li>
<li>Configure shorewall as you want it (remember to make it secure! There are plenty of good examples online, but firewalls are something you need to carefully consider for yourself rather than blindly accepting someone's suggestions). </li>
<li>Install telnet (useful for <a href="http://ubuntuwiki.net/index.php/SMTP,_testing_via_Telnet">testing smtp</a>)</li>
<li>You probably don't need it, but if you find a CLI-only server intimidating, <a href="http://www.webmin.com/">webmin</a> is a wonderful front-end to most things in Ubuntu. </li>
</ol>
<li><b>Configure Postfix</b></li>
<ol>
<li>Execute <i>sudo apt-get install postfix</i> (and any dependencies that crop up). </li>
<li>In /etc/postfix/main.cf, you'll probably want to have the following, replacing the bits <<i>in angle brackets</i>> with your settings: </li>
<ul>
<li>smtpd_use_tls=yes</li>
<li>smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated defer_unauth_destination</li>
<li>myorigin = <<i>domain.tld</i>></li>
<li>mydestination = <<i>FQDN of HOST</i>>, $myhostname, mail, localhost.localdomain, localhost</li>
<li>relayhost = [smtp-relay.gmail.com]:587</li>
<li>mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 <<i>space delimited list of any IPs that can send Mail through it, in CIDR format - recommend internal IPs ONLY unless you set up a much more complex configuration with user authentication - which defeats the point. NB this is how you prevent randoms spamming through your system - be restrictive!</i>></li>
<li>mynetworks_style = subnet</li>
<li>smtpd_recipient_restrictions = permit_mynetworks permit_inet_interfaces</li>
<li>myhostname = <<i>FQDN of host</i>></li>
</ul>
<li>...alongside any defaults; save.</li>
<li>Restart postfix service.</li>
</ol>
<li><b>Configure GSuite for Education</b></li>
<ol>
<li>Make sure you understand any limitations and how they may apply to your site. If you're sending enough email that you're going to hit <a href="https://support.google.com/a/answer/2956491?hl=en">the limits</a>, you're probably going to need to call in a 3rd party to help you manage your "spam operation" (<a href="https://mailchimp.com/">MailChimp</a> is nice!). Alternatively, set up some other MX records on another (sub)domain for your massive email requirements, and email outside of GSuite!</li>
<li>Log into your GSuite admin control panel. </li>
<li>Navigate to the mail setup section: Apps > G Suite > Gmail > Advanced Settings. Scroll down to "SMTP relay service".</li>
<li>Set up your mail relay host details, making sure you configure the correct public IP address there for your mail relay server. If you're not sure what your public IP address is, install lynx on your server, and google "what is my IP", which will return your public IP address. </li>
<ol>
<li>Typically, you're really going to be wanting to do this from a static IP address, so if you have to do this, consider that your mail gateway might well break next time you reboot your router, or your ISP changes your DHCP leased IP address for any other reason. </li>
<li>Make sure this IP is ONLY used by your email gateway server - it should be dedicated to this purpose, and not used to NAT other traffic out of, etc. </li>
<li>If you need help choosing the right settings for your site, see <a href="https://support.google.com/a/answer/2956491">Google's help page</a>.</li>
</ol>
<li>Save.</li>
</ol>
<li><b>Configure your firewall, if necessary</b></li>
<ol>
<li>Configure your firewall (on the host and on your edge) to allow outgoing email from that machine. We've set it up to use encryption through TLS, so if you block outgoing port 25, that should not be a problem - so long as you allow 587 out from the server (or the subnet it lives in) to Google. </li>
<li>You should block outgoing tcp/25 from your LAN to the Internet, particularly for client and guest traffic. (See e.g. <a href="http://www.uceprotect.net/downloads/MAAWGPort25English.pdf">this</a> for some background).</li>
<li> You may also need to set up src-NAT to appear from a specific IP (or range); I highly recommend you reserve and use a public IP ONLY for your mail server and NOTHING ELSE. Obviously, it should be the IP specified in the mail relay setup in GSuite, above. </li>
<li><i>N.B. if you are trying to set up email on a dynamic public IP address (like on a home ADSL connection) you are going to struggle. Pay for a static IP or a small block of them from your ISP. Ideally, ask them to have the PTR resolve to the same FQDN as you would put in for an A record for that IP if it is publicly accessible - and</i> certainly<i> do this if you're not relaying through Google (i.e. you send some email directly - outside the scope of this post).</i></li>
</ol>
<li><b>Test</b></li>
<ol>
<li>Test that it works. Test from the Ubuntu CLI; test from another machine on your network. <a href="https://community.spiceworks.com/how_to/11-test-email-flow-using-smtp-commands">Interactive telnet sessions to port 25 work quite well</a>; you can also of course set up an email client program to test; <i>mailx</i> will work for basic CLI email sending.</li>
</ol>
<li><b>Put into production</b></li>
<ol>
<li>Put the host into your monitoring system, with appropriate checks and reporting.</li>
<li>Point anything that doesn't support modern email standards at your shiny new mail gateway!</li>
<li>Consider restricting hosts that can use it in either (or both) the firewall or Postfix configurations. </li>
<li>Make absolutely sure random hosts outside of your network can't use it as an open relay; NAT and RFC1918 addresses are a helpful safeguard, but intentional firewalling (network edge/borders and host) and restrictions in postfix config (allow things through mynetworks - a /32 CIDR netmask is useful to whitelist a single host) are better. </li>
<li>Remember to regularly check that your server is patched and up to date.</li>
<li>Make sure if devices or services change IP that you update the firewall and/or postfix configs to match - particularly do so for things you remove where there is a risk some random host will end up getting that permitted IP down the line. </li>
<li>Another reason you WANT to limit this is that anyone with access to this can "pretend" to be any valid user by "spoofing" the From: field - this is by design, so lock down anything that is allowed to use this facility to prevent embarrassment or more serious repercussions. </li>
</ol>
<li><b>Internet Standards</b></li>
<ol>
<li>You should probably read some applicable RFCs to better understand the complexities of email on the "modern" internet. In particular, <a href="https://tools.ietf.org/html/rfc5068">https://tools.ietf.org/html/rfc5068</a>, and note that, to some extent, a "mail gateway" of this sort is against this "best practice" - what ultimately it is, of course, is a dirty hack to make things that are <i>NOT</i> modern Email RFC compliant get mail off your campus and into your email system - and then in line/compliant with other best practices like <a href="http://schoolsysadmin.blogspot.co.za/2017/06/outgoing-email-security-in-2017-spf.html">SPF, DKIM and DMARC</a>. </li>
</ol>
</ol>
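<div>
The <i>mynetworks</i> advice above (be restrictive; a /32 whitelists a single host) can be sanity-checked offline before you touch the server. This is a minimal sketch, not part of Postfix: the helper names <i>parse_mynetworks</i> and <i>is_permitted</i> and the 10.20.30.40/32 entry are made up for illustration, and it only understands plain CIDR entries (real Postfix also accepts file/table lookups and "!" exclusions, which are ignored here).</div>

```python
import ipaddress


def parse_mynetworks(value):
    """Turn a mynetworks-style string into a list of ip_network objects.

    Assumption: only plain CIDR entries, separated by spaces and/or commas.
    """
    networks = []
    for token in value.replace(",", " ").split():
        # Postfix writes IPv6 entries as [addr]/len; drop the brackets.
        cleaned = token.replace("[", "").replace("]", "")
        networks.append(ipaddress.ip_network(cleaned, strict=False))
    return networks


def is_permitted(client_ip, networks):
    """Return True if client_ip falls inside any of the listed networks."""
    addr = ipaddress.ip_address(client_ip)
    # Membership tests across IPv4/IPv6 simply evaluate to False.
    return any(addr in net for net in networks)


# Hypothetical example: loopback plus one whitelisted internal host.
EXAMPLE = "127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 10.20.30.40/32"
NETWORKS = parse_mynetworks(EXAMPLE)
```

<div>
With that example, <i>is_permitted("10.20.30.40", NETWORKS)</i> is true while its neighbour 10.20.30.41 is not, which is exactly the behaviour you want from a single-host whitelist.</div>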
<div>
Of course, it should be obvious to the reader that slight tweaks to these settings will allow you to use Postfix to interface obsolete/buggy gear with just about any service that supports similar mail submission processes.<br />
<br />
<b>Links for further reading:</b></div>
<div>
<ul>
<li><a href="https://support.google.com/a/answer/2956491?hl=en">Google SMTP Relay</a></li>
<li><a href="https://support.google.com/a/answer/176600?hl=en">Use SMTP settings to send mail from a printer, scanner or app</a></li>
<li><a href="https://support.google.com/a/answer/6140680#maildenied">Troubleshooting SMTP Relay</a></li>
<li><a href="http://www.postfix.org/BASIC_CONFIGURATION_README.html">Basic Postfix configuration</a></li>
</ul>
</div>
<div>
Posted by James Stapley. Published 2018-02-20, updated 2018-02-21.</div>
<h2>
MPLS causes some weird effects - aka Why is traceroute so much slower than ping for some hops?</h2>
Recently, my attention was drawn to something Odd about our traceroutes - namely, that traceroute and ping to an intermediate host on a route could have <i>wildly</i> different values.<br />
<br />
This <i>really</i> bothered me, once I was forced to think about it.<br />
<br />
I had previously assumed (wrongly) that the unexpectedly high second hop RTT values (and similar subsequent ones) across our service provider were due to low priority in processing ICMP/traceroute packets (many routers treat these things as low priority, for various good reasons).<br />
That was a good enough "explanation" that I'd not really thought beyond that (or, it hadn't bothered me enough to get properly intrigued).<br />
And I hadn't done pings to those intermediate hosts, and compared them side-by-side.<br />
Shame on me.<br />
<br />
And sure, ping and traceroute by default use different protocols (until you do traceroute -I).<br />
But that's not it either.<br />
<br />
Maybe traceroute sends so many more packets at a time than a ping that you hit a rate limit (1 per 500ms is a rate limit on some routers)? <i>ping</i> is ~1 per second;<i> traceroute</i> fires out loads in groups of 3 spaced per hop (well, TTL increment) quite closely together.<br />
That's not it either.<br />
<br />
Maybe a firewall was breaking things?<br />
But no, that makes no sense; both in this case are ICMP Echo, and it's unlikely they're going to treat ICMP Echo to destination A differently to Destination B on the Internet.<br />
<br />
I'm familiar with a bunch of other common pitfalls with interpreting traceroutes, but this wasn't one of those.<br />
<br />
As someone who really likes networking, this should have prompted investigation long ago, but it's not bothered me enough to go work it out (aka "I had more pressing concerns").<br />
<br />
Until someone said "Explain this" and presented a side-by-side ping and traceroute with Odd Results...<br />
<br />
Then, of course, you start THINKING about the problem, which, if you're not familiar with the underlying configuration (and particularly some potential configurations of service provider networks outside your own control), will probably cause you to pull your hair out.<br />
<br />
So why...?<br />
<br />
<a name='more'></a><br />
Basically, what this revealed was a big (MPLS shaped) hole in my knowledge of (large) Service Provider network routing.<br />
<br />
So, back to the drawing board, and some useful background.<br />
<br />
If you're in this situation, and you do a <i>traceroute -I</i> to a host, it's effectively (but not exactly) pinging (ICMP Echo) each host along the path - but when you manually ping (which should [and does] use the same packet type and give you the same answer) you (may) get a wildly different result.<br />
<br />
This is actually two problems - a) why are there a lot of traceroute hops with more or less the same latency and b) why is ping to (at least some of) those hops so much faster?<br />
<br />
Note that using normal Unix traceroute (which uses UDP packets with increasing TTL) will give you similar results - but we want to discount any influence of protocol - so let's stick to ICMP Echo for both ping and traceroute to minimise some possible sources of difference.<br />
<br />
Now go back and manually ping each hop IP as exposed by traceroute.<br />
Compare the traceroute time values to each hop's direct ping RTTs.<br />
Anything Odd?<br />
<br />
(<i>NB you will not see this effect, or you may only see a) unless you have this specific network configuration affecting you</i>).<br />
<br />
You would expect (with some modest variability) similar results for both tests - but <i>not</i> a difference of tens of milliseconds between the two; you would also expect an increase along the path, not a whole bunch of more or less the same figures.<br />
<br />
But no, there <b>are</b> differences of tens of milliseconds involved. And a whole bunch of routers with more or less the same RTT value even though they're geographically quite widely dispersed.<br />
<br />
That is pretty weird if you understand what <i>traceroute -I</i> and<i> ping</i> do, and have a mental map of where the packets are going and the expected speed-of-light-in-glass RTT for those paths.<br />
<br />
Of course, if you really <i>really</i> understand modern networks, there are some other things to consider. Eventually, I begged for the answer.<br />
<br />
We've not verified this with the upstream ISP, but research into this seems to suggest it's by far the best explanation for the phenomenon:<br />
<h2 style="text-align: center;">
<i><br /></i></h2>
<h2 style="text-align: center;">
<i>MPLS (sometimes) breaks some common expectations/assumptions. </i></h2>
<div>
<i><br /></i></div>
You'll probably want to go and read a bunch more about <a href="https://en.wikipedia.org/wiki/Multiprotocol_Label_Switching">MPLS</a> before you understand all of this (I certainly need to), but basically, what it comes down to is some of these tests (traceroute) end up basically reporting the RTT from the end of an <i>MPLS tunnel </i>(LSP), whereas others (ping) sometimes do not necessarily do so. I actually stumbled across this answer, but dismissed it, as I hadn't properly considered the effects - or remembered that, hey, I'd seen some MPLS labels in some traceroutes before, and there's <i>plenty</i> of MPLS around the place these days...<br />
<br />
Here are some values, so you can see what I'm talking about.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEilYwTpXqX2MYpi0ZdWBjwkhNp7HVLOQp7C_y-dDid-rp088tmOLPGhPlcQFCFyzwpf0OAqrBSIFpcKDJj_pSccox4xAxhM7KYOIIwBxrsrMSIILd_VZsHfsFaPO_MVWSyxF5vZfFU4gRs/s1600/tracertjump.PNG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="245" data-original-width="736" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEilYwTpXqX2MYpi0ZdWBjwkhNp7HVLOQp7C_y-dDid-rp088tmOLPGhPlcQFCFyzwpf0OAqrBSIFpcKDJj_pSccox4xAxhM7KYOIIwBxrsrMSIILd_VZsHfsFaPO_MVWSyxF5vZfFU4gRs/s1600/tracertjump.PNG" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Note sudden jump to ~30ms - even though this router is just down the road (~130km away).</td></tr>
</tbody></table>
Remember, the speed of light in fibre is around 200,000km/s (fibre's higher refractive index than air or a vacuum "slows down" the speed of light) - so in ideal conditions, ~200km/ms. But that's one way (5 microseconds per kilometer in one direction); a round trip covers the distance twice, so it's half that - 100km of real world distance per ms of RTT. Obviously there are various additional delays - fibre is not straight point-to-point, there are processing delays in routers, and so on. But it does place a bound on "reasonable" values to expect (or unreasonable ones to reject). <a href="http://www.m2optics.com/blog/bid/70587/Calculating-Optical-Fiber-Latency">More accurate figures </a>are possible, but this 100km/ms is a memorable figure for the average network tech - even if the physics people squirm a bit. (This reminds me of the <a href="https://www.ibiblio.org/harris/500milemail.html">500 mile email story</a>).<br />
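<div>
That rule of thumb is easy to turn into a quick sanity check. The sketch below only encodes the ~200,000km/s figure quoted above; the function names are my own, and it deliberately ignores router processing time and non-straight fibre routes, so treat the result as a lower bound rather than a prediction.</div>

```python
# Approximate speed of light in fibre: ~200,000 km/s, i.e. ~200 km per ms one way.
FIBRE_KM_PER_MS_ONE_WAY = 200.0


def min_rtt_ms(distance_km):
    """Lower bound on round-trip time for a given one-way fibre distance."""
    return 2 * distance_km / FIBRE_KM_PER_MS_ONE_WAY


def max_distance_km(rtt_ms):
    """Furthest a host could possibly be (one way) given a measured RTT."""
    return rtt_ms * FIBRE_KM_PER_MS_ONE_WAY / 2


# A router ~130 km away cannot answer faster than about 1.3 ms...
bound_for_130km = min_rtt_ms(130)
# ...and a 30 ms RTT would be consistent with a path of up to ~3000 km.
reach_for_30ms = max_distance_km(30)
```

<div>
So ~4ms for a hop ~130km away is comfortably plausible, while ~30ms is enough time to get to something thousands of kilometres away and back.</div>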
<div>
<br /></div>
<div>
I can live with 4ms to PLZ and back. But 30ms? Seems like something is "broken":<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifjvYPOs7ZOyE8qUG7DS98vzYXDSQUHGMJJGp98Ig1SLM3exXeV1tpWs3p0m7jIzOB7zYg6LjxRVPDzbv3dEfApc0WTsRu3fMuktQfpigjgLfzy4y4k9GBHgnG4_Hrr0-3aAhjrliG4p8/s1600/pinglow.PNG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="174" data-original-width="561" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifjvYPOs7ZOyE8qUG7DS98vzYXDSQUHGMJJGp98Ig1SLM3exXeV1tpWs3p0m7jIzOB7zYg6LjxRVPDzbv3dEfApc0WTsRu3fMuktQfpigjgLfzy4y4k9GBHgnG4_Hrr0-3aAhjrliG4p8/s1600/pinglow.PNG" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Here's a ping to the same hop. Only ~4ms. <br />
That's like an order of magnitude difference from the previous result, but more plausible for something ~130km away.<br />
That is CRAZY.</td></tr>
</tbody></table>
~34ms vs 4ms difference?<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9s9q0bMFmX7h-hV19koVt8NJI2jAwsdu37Sz9piwWQajIh22OPYBjJ_ASrWBZPHTQjjWbOrelFqBtmkcUXYPJp1shcdcV3OdmVInVq8CYB1OMxctlqMfoyF0ybw_4sWLNuodvqnZjvVY/s1600/preuzmi.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="375" data-original-width="500" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9s9q0bMFmX7h-hV19koVt8NJI2jAwsdu37Sz9piwWQajIh22OPYBjJ_ASrWBZPHTQjjWbOrelFqBtmkcUXYPJp1shcdcV3OdmVInVq8CYB1OMxctlqMfoyF0ybw_4sWLNuodvqnZjvVY/s1600/preuzmi.jpg" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><i>This picture about sums it up.<br />At least until you understand what is going on.</i></td></tr>
</tbody></table>
<h2>
Digging into MPLS</h2>
<div>
<br /></div>
So why do we even <i>want</i> an MPLS tunnel? <i>How dare you break my </i>traceroutes!<br />
<br />
MPLS has some interesting features for service provider networks and large corporates (only one of which is L3 MPLS VPNs), which you should go and read about. I promise they're (in the end) useful to you as a customer, and you're probably using them, often without even knowing about it.<br />
<br />
<i>MPLS tunnels</i>, in particular, mean that your service provider (or you in a huge network) can use less "powerful" routers (ones that can't do BGP or aren't big enough for a full internet route table) to whack customer packets across their network in interesting ways. MPLS is quick and speedy at doing this sort of thing, and is particularly useful if you want a "BGP Free MPLS Core" - which you might well do for reasons involving saving huge amounts of money, or some interesting traffic engineering options. You can then have provider routers (P) that only care about MPLS labels (aka "label switching routers" or LSRs) and have no idea about routes to all possible src/dst IP addresses, wanging packets about the place for you. Provider Edge (PE) Routers are more likely to have a more complete (or even full) BGP routing table. Effectively, by design, one bonus is that switching MPLS labels is more like switching Ethernet frames than routing packets - it's likewise faster and requires cheaper hardware for the same line rate throughput (compare the cost of a 10Gb/s switch with a 10Gb/s router - I can get a 16 port 10Gb/s switch for about 10% of the price of a 24 port 10Gb/s L3 switch - this will be similar for routers). There are other costs (you usually have to manually create the tunnels by defining LSPs, for instance), but when you're talking about saving [several] zeroes on [many] routers, or using some other "must have" feature of MPLS, the Network Architect pain pays off. This also makes me think that SP-track network certs will be interesting to study.<br />
<br />
<a href="https://forums.juniper.net/t5/Routing/what-does-quot-icmp-tunneling-quot-mean-in-mpls-vpn/td-p/164284">https://forums.juniper.net/t5/Routing/what-does-quot-icmp-tunneling-quot-mean-in-mpls-vpn/td-p/164284</a> in particular has some very key things to help understand about what is going on. That diagram about half way down in particular is a major "a-ha!" moment.<br />
<br />
Here is an even clearer visual:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYQY9hr3I_kDEm5fdk-c1J3jVJ5CH7efr3U5T1JedxvjjlUhon_q6WseZqorXWmw6CdEyLB9deShJnUj5CnlBcuIQhkoPCzBVk_OvwCFHbmLlHEPXi_-K4iDryTdoB3TCoV2WoZNkSAFM/s1600/mplsicmpresult.PNG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="465" data-original-width="631" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYQY9hr3I_kDEm5fdk-c1J3jVJ5CH7efr3U5T1JedxvjjlUhon_q6WseZqorXWmw6CdEyLB9deShJnUj5CnlBcuIQhkoPCzBVk_OvwCFHbmLlHEPXi_-K4iDryTdoB3TCoV2WoZNkSAFM/s1600/mplsicmpresult.PNG" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Shamelessly stolen from the excellent<br />
https://www.slideshare.net/RichardSteenbergen/a-practical-guide-to-correctly-troubleshooting-with-traceroute</td></tr>
</tbody></table>
From a customer-side network engineering point of view, tunnels of this type make Internet routing/paths (in parts) almost as hard to figure out (opaque) as L2 paths - and you can't exactly get into some random router on the Internet to troubleshoot (although on your LAN, you could figure out L2 paths by looking at MAC tables in your switches). So it's certainly something to think about. But you don't necessarily have to <i>care</i> about it - but remember to<i> account for it</i> in, for example, odd results like those shown. The ones we've seen here are semi-transparent - but some configurations can completely hide the LSRs in the middle - you might just notice a surprisingly long "hop" at some stage, but think nothing of it "oh, that must be a long distance link" (and, effectively it is, but it's not a light path - it's a label switched MPLS route with several LSR P routers involved).<br />
<br />
Depending on the options your ISP (or some other network along the path) uses, this may or may not be a feature you see - MPLS can entirely hide topology from the end user's perspective, and for most people, there's no issue with that. However, if they elect to use <i>icmp tunneling</i> features, then hops can be seen, but those hops may struggle to route the packets efficiently - you get a good idea of the path, but the timings look odd.<br />
And this is ultimately what we suspect is happening here.<br />
I'm guessing we're seeing at least 2 MPLS tunnels there; hosts in what I think is the 2nd tunnel consistently give ~30ms direct ping results too, whereas the others in "tunnel 1" give steadily incrementing RTTs - up until the router that goes back down to ~15ms - after which both ping and ICMP traceroute are ~30ms.<br />
To my knowledge, our traffic for 8.8.8.8 normally goes from here, to PLZ, to CPT, to JNB to get to Google's GGC in Johannesburg. I suspect there are Traffic Engineering (TE) MPLS tunnels from our local PoP to Cape Town, and from Cape Town to Joburg.<br />
<br />
But hang on a minute, <i>why doesn't this (always) happen with Ping too</i>? Surely<i> that</i> ICMP gets tunneled as well? I mean, it's called "ICMP tunneling" isn't it? And how can we explain the differences in behaviour between the "first" and "second" tunnel wrt ping and traceroute times?<br />
<br />
Well, if you're pinging a router directly, then it probably knows where you are (not necessarily so for others, particularly pure LSRs (P Routers), but in many SP networks, the intermediate routers may well know what's going on for their customers - at least within a "region" of some kind, but not for the global Internet), and it knows who it is - unlike a traceroute ICMP Echo which instead has a destination of the ultimate destination IP - and they're also likely to know about some of their neighbouring routers in that situation. But as far as that intermediate host router in the MPLS tunnel is concerned, the route may be entirely unknown, and it will typically spew that packet out the other end of the tunnel (so the RTT becomes that of the tunnel endpoint that has enough routing information to return the packet).<br />
<br />
A tracert to a generic host on the Internet is quite unlike a ping packet addressed to the router itself, which has a source and destination IP it might know how to route, particularly if the routers between itself and the PE router the customer uses have routes to the customer's prefix, but those routers (almost by definition) are not likely to have a full Internet table, so will merrily forward the TTL expired out the end of the LSP for the PE router at the other end to return (often down another LSP).<br />
<br />
<i>Or perhaps of course it's MUCH simpler than that</i> - direct ping packets to at least some MPLS tunnel (P LSR) routers <i>don't end up in the MPLS tunnel in the first place</i>, because that destination IP (unlike ICMP traceroute to say 8.8.8.8, which is shoved into the tunnel) isn't part of the prefixes shoved into the MPLS tunnel. That would certainly explain observed behaviour in concert with different "areas" of an SP network having different routing information. Pings to later (more distant) P LSRs may end up in an MPLS tunnel, which is why some Pings end up being similar to the traceroute RTTs on all the intervening hops, whereas some are much closer to the expected "speed of light" time - and this is a function of distance, and seems moderately sane for the design of a large network from "basic principles". This, I think, lies at the heart of the much shorter pings direct to many of the routers themselves (vs much longer ICMP traceroute RTTs through the same routers). Either this paragraph or the one above explains behaviour b). I far favour this explanation, as the one above seems like black magic that doesn't gel well with my understanding of routers - but it was my first "wtf is going on here" hypothesis.<br />
<br />
With that minor brainwave (customer vs service provider vs global Internet routing), I thought "hey, what happens if I compare traceroute/ping test to another customer of our ISP"? Long story short, "it depends" and I don't know of enough customers or enough about the SP's MPLS topology to definitively work out what happens; those on the same PoP as us work as expected; those a router away also as expected; the rest, a bit of a mess and somewhat uncertain. I guess <i>Google Moar</i>! is in order.<br />
<br />
It's probably easiest to just accept that, on the whole, MPLS tunnel routers (P LSRs) with icmp tunneling set up are going to end up appearing to "report" (an oversimplification of what's going on - your host is in charge of the timing, and it's because the packet has travelled to the end and back that it gets that longer RTT) a lot of RTTs from the far end of the tunnel - that at least explains the observed unexpectedly long (and consistent for several widely spaced hops) traceroute behaviour (part a) of the puzzle).<br />
<br />
<div>
Of course, routes change, so it's worth making sure your assumptions haven't changed in the middle of an investigation - here's the latest tracert from my windows machine; it's well worth knowing "typical" and "expected" routes from your location, but don't be too hasty to report a "fault", as your service provider knows their network better than you do...!</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmu-kvx_ai0C_WoHG7HOe8Fd1hRiITjeZL-T1vD3fUb03ke27nh44WxJSMXNiZAaD1E4DwwXCkwFPMya3qytOP2I_GCT2qIP7hMT3GAis1qkUlKZOHZXoPyurHA0zMeMq9xvmuWVwIASc/s1600/routingchange.PNG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="301" data-original-width="696" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmu-kvx_ai0C_WoHG7HOe8Fd1hRiITjeZL-T1vD3fUb03ke27nh44WxJSMXNiZAaD1E4DwwXCkwFPMya3qytOP2I_GCT2qIP7hMT3GAis1qkUlKZOHZXoPyurHA0zMeMq9xvmuWVwIASc/s1600/routingchange.PNG" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;"><i>Compare it with the earlier one from Ubuntu<br />It's going the other way around the country!</i></td></tr>
</tbody></table>
I'll wireshark it to demonstrate the major differences between ICMP traceroute and ICMP ping; if you don't know wireshark, it's worth adding to your toolkit, it makes figuring stuff out quite easy:<br />
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2Ib0PChcTlvyn351N_cBCS9Pm46LNzSHszPoVQlbee1rWt6jbXfkzTFat-frhjHs68OdY_4LR8R4JUy0SLpHcSA4xOKtURkoiwk3Yoc-zzJ8McePGmNCxL2u3roAPzJqKLJy7IVwlA4s/s1600/wiresharkTracertPing.PNG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="654" data-original-width="1090" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2Ib0PChcTlvyn351N_cBCS9Pm46LNzSHszPoVQlbee1rWt6jbXfkzTFat-frhjHs68OdY_4LR8R4JUy0SLpHcSA4xOKtURkoiwk3Yoc-zzJ8McePGmNCxL2u3roAPzJqKLJy7IVwlA4s/s1600/wiresharkTracertPing.PNG" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Partial Wireshark (filtered) of an ICMP Traceroute from my PC to 8.8.8.8, <br />
(tail end of tracert) <br />
followed (highlighted packet onwards) by a Ping directly to the first ~30ms host. </td></tr>
</tbody></table>
<div>
As you can see, Traceroute ICMP packets have (logically!) the destination address of the desired end host. Those MPLS tunnel routers don't know what to do with the response, so it goes out the other end, and the remote PE router sends it back, with the result that the RTTs are ~the RTT to the end of the tunnel and back again (i.e. a bunch of ~30ms RTTs); the "traditional ping" of course has a dst of the router itself, and clearly at least those in the "first" MPLS tunnel have enough route information to get directly back to me (but not with a dst of 8.8.8.8) - or, perhaps, don't go via the MPLS tunnel in the first place - those in my hypothetical second tunnel seem not to escape that fate (and that fits with routers that don't have full routing info available for the entire SP and client network prefixes, let alone global routes - but also fits with different routing depending on how things get stuffed into the tunnels).</div>
<div>
<br /></div>
<div>
If we think of MPLS tunnels (well, LSPs, strictly) as being unidirectional from point A (PE A) to point B (PE B), then which end of a tunnel you're at (and whether your P router in that tunnel has relevant routing information) may have an effect and may explain the observed differences in ping vs traceroute times; obviously, the routing tables on the MPLS tunnel routers (P routers) play a role. I'm speculating here, before going off to learn more about this stuff, but it kind of makes sense. I don't have access to a host at the other end to test if the reverse pattern holds true, but it would be fun to find out...<br />
<br />
I could (will!) probably hack a test network together to figure this out more or less definitively with a bunch of Mikrotik routers <a href="https://wiki.mikrotik.com/wiki/Manual:MPLS">which support MPLS</a> and <a href="https://wiki.mikrotik.com/wiki/Manual:MPLSVPLS">MPLS/VPLS</a> and <a href="https://wiki.mikrotik.com/wiki/Manual:BGP_based_VPLS">BGP-based VPLS</a> - if you don't have a lot of money and/or an employer with a test lab, Mikrotiks let you play with a LOT of amazing tech at very low prices - go buy a bunch of RB750s and play. "Book learning" gets you quite far, but playing with things yourself really cements that knowledge, and often gives you a better "gut feel" of what is going on. Unfortunately, I can't see an <i>icmp tunnel </i>option, so that may reduce my ability to figure out exactly what's happening - the service provider's network is mainly Juniper and Cisco. Of course, one day, I'll probably bump into a SP engineer and say "Hey, this is odd; I understand MPLS makes the traceroutes odd and more or less what's going on there, but why does Ping do <i>this</i>? Is it *this* (limited customer routes on some P LSRs - less likely, because then surely it could return the TTL expired from ICMP traceroute) or *this* (pings destined to (some) P routers don't get tunneled) that explains this "odd" behaviour, <i>or is it something else I haven't even learned about yet</i>?".<br />
<br /></div>
<h2>
Exposing MPLS tunnels with traceroute options</h2>
<div>
<br /></div>
<div>
Of course, you can also show MPLS tunnels with traceroute (at least if the operator allows it) - at least in Unix style traceroute implementations, with the -e flag:</div>
<div>
<br /></div>
<div>
sudo traceroute -I -e -n 8.8.8.8</div>
<div>
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets</div>
<div>
1 10.2.255.253 14.648 ms 14.614 ms 14.609 ms<br />
2 10.255.0.1 0.385 ms 0.378 ms 0.371 ms<br />
3 196.21.242.161 0.511 ms 0.507 ms 0.621 ms<br />
4 196.21.242.1 1.358 ms 1.355 ms 1.476 ms<br />
5 155.232.5.4 &lt;MPLS:L=24043,E=0,S=1,T=1&gt; 35.028 ms 35.026 ms 35.083 ms<br />
6 155.232.6.41 &lt;MPLS:L=24196,E=0,S=1,T=1&gt; 31.653 ms 31.443 ms 31.709 ms<br />
7 155.232.64.68 &lt;MPLS:L=639531,E=0,S=1,T=1&gt; 14.075 ms 14.046 ms 14.041 ms<br />
8 * * *<br />
9 155.232.1.36 &lt;MPLS:L=310480,E=0,S=1,T=1&gt; 34.551 ms 34.390 ms 34.364 ms<br />
10 155.232.129.17 30.850 ms 30.838 ms 30.900 ms<br />
11 72.14.211.150 30.928 ms 30.890 ms 30.805 ms<br />
12 72.14.239.117 34.398 ms 34.413 ms 34.451 ms<br />
13 8.8.8.8 30.953 ms 30.739 ms 30.842 ms</div>
<div>
<br /></div>
<div>
(the -n flag disables the reverse lookups of the IP addresses and is far more readable in this example, but they can be useful, so drop that flag if you want to). Much like with DNS (dig vs nslookup), the Unix toolset beats the pants off what's in Windows (and further compare the wonders of mtr [the e key will reveal MPLS there too] with pathping!). Interestingly, I saw different results between -I and non-I versions. </div>
<div>
<br /></div>
<div>
Oh look! MPLS labels! Mystery solved - definitely some MPLS tunnels - several of them (more than 2) - see the L values. </div>
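<div>
If you want to pull those labels out of a saved trace rather than eyeball them, something like the following works on output shaped like the above. It's a rough sketch: the function name <i>mpls_hops</i> is my own, the sample is a trimmed copy of the trace in this post, and it assumes the single-label form shown here (a hop carrying a stack of several labels would only have its first label picked up).</div>

```python
import re

# Matches the MPLS extension as printed by "traceroute -e" in the trace above.
MPLS_RE = re.compile(r"<MPLS:L=(\d+),E=(\d+),S=(\d+),T=(\d+)>")
# Hop number followed by the responding address (or "*").
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)")

SAMPLE = """\
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
 4  196.21.242.1  1.358 ms  1.355 ms  1.476 ms
 5  155.232.5.4 <MPLS:L=24043,E=0,S=1,T=1>  35.028 ms  35.026 ms  35.083 ms
 6  155.232.6.41 <MPLS:L=24196,E=0,S=1,T=1>  31.653 ms  31.443 ms  31.709 ms
 7  155.232.64.68 <MPLS:L=639531,E=0,S=1,T=1>  14.075 ms  14.046 ms  14.041 ms
 8  * * *
 9  155.232.1.36 <MPLS:L=310480,E=0,S=1,T=1>  34.551 ms  34.390 ms  34.364 ms
10  155.232.129.17  30.850 ms  30.838 ms  30.900 ms
"""


def mpls_hops(traceroute_output):
    """Return (hop number, address, label) for every hop that shows an MPLS label."""
    hops = []
    for line in traceroute_output.splitlines():
        hop = HOP_RE.match(line)
        label = MPLS_RE.search(line)
        if hop and label:
            hops.append((int(hop.group(1)), hop.group(2), int(label.group(1))))
    return hops


LABELLED = mpls_hops(SAMPLE)
```

<div>
On the sample, hops 5, 6, 7 and 9 come back with labels; hop 8 (no reply) and the unlabelled hops are skipped.</div>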
<h2>
<br />Conclusion</h2>
<div>
<br /></div>
<i>If you see a bunch of similar ping times in a tracert, with a big jump between two hops (that isn't the result of, say, a transoceanic transit) and wildly different direct ping vs ICMP traceroute round trip times (RTTs) to each hop, suspect MPLS tunnels with ICMP tunneling enabled. </i><br />
<br />
It kind of leaves me itching to see more of the config on the service provider's routers...<br />
<br />
It's a jump from mid-size enterprise networking knowledge levels into large Service Provider/massive enterprise networking, but it's useful to know about! That's one thing I enjoy about networks - they're fun puzzles, and there is always more to learn. Anyway, I've learnt something new and useful about interpreting Traceroute results today, and found a satisfying answer to something that's sort of bothered me for years - maybe you have too! I'll probably come back and tweak this article as I learn more about MPLS and my understanding solidifies.<br />
<br />
<h2>
Further reading:</h2>
<br />
<ul>
<li><a href="https://www.nanog.org/meetings/nanog49/presentations/Sunday/mpls-nanog49.pdf">https://www.nanog.org/meetings/nanog49/presentations/Sunday/mpls-nanog49.pdf</a> - MPLS for Dummies - well worth a read as a "primer" on MPLS. </li>
<li><a href="https://www.slideshare.net/RichardSteenbergen/a-practical-guide-to-correctly-troubleshooting-with-traceroute">https://www.slideshare.net/RichardSteenbergen/a-practical-guide-to-correctly-troubleshooting-with-traceroute</a> - Correctly troubleshooting with traceroute - absolute gold!</li>
<li><a href="https://www.caida.org/publications/papers/2012/revealing_mpls_tunnels/revealing_mpls_tunnels.pdf">https://www.caida.org/publications/papers/2012/revealing_mpls_tunnels/revealing_mpls_tunnels.pdf</a> - Research paper about detecting MPLS tunnels</li>
<li><a href="http://rtodto.net/traceroute-in-mpls-icmp-tunneling/">http://rtodto.net/traceroute-in-mpls-icmp-tunneling/</a> - Traceroute behaviour in MPLS tunneling</li>
</ul>
</div>
James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com0tag:blogger.com,1999:blog-5949954951402579361.post-8091817497815262192018-02-16T11:36:00.000+00:002018-02-16T16:40:29.312+00:00Chromebooks? Yes Please. We've started seeing more and more Chromebooks.<br />
<br />
To those in education overseas, they're not exactly news, but they have recently become (slightly) less unusual in South Africa, and are (intermittently) available from local suppliers. With the advent of Android-compatible models, we can now use them across all of our "core" software.<br />
<br />
So far, I've been very pleased with them from a sysadmin point of view.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDReNM3Jd3RI35KwZmMWGiBpiwLBaURr1zJzKt9PBNej1qmhnkkTsnM3EfFC8aftreJ7WfMokj7WkHEm-OC8Nxe-xV17P_1dwFJfYpI7nRMv1xwfoTp0NRlEhY1vHyb18PGf8LPamX3LI/s1600/AcerChromebookR11_C738T_black-photogallery-01.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="380" data-original-width="420" height="289" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDReNM3Jd3RI35KwZmMWGiBpiwLBaURr1zJzKt9PBNej1qmhnkkTsnM3EfFC8aftreJ7WfMokj7WkHEm-OC8Nxe-xV17P_1dwFJfYpI7nRMv1xwfoTp0NRlEhY1vHyb18PGf8LPamX3LI/s320/AcerChromebookR11_C738T_black-photogallery-01.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Acer R11 C738T</td></tr>
</tbody></table>
<br />
Read on for more experiences....<br />
<br />
<a name='more'></a>We had a slow start to this, as the devices that were ordered were delayed in their arrival by quite some time (due to some shipping mistakes by the distributor). I suspect they went to another similarly named school in another province (we've had that happen with Apple products before, too!). That requires someone to <i>really</i> not be paying attention in the packing and shipping depts. :/<br />
<br />
Of course, that has nothing to do with the devices themselves...<br />
<h3>
First: The Bad/Ugly</h3>
<br />
I've found one thing I *don't* like about Android-compatible Chromebooks - the managed app store (i.e. where you restrict apps to those you approve) is "frozen in time" to whatever apps were approved there <i>at the time the user first signed into the Chromebook</i>.<br />
<br />
We've found one work-around/dirty hack - sign the user out, remove the user at the sign-in screen, and get them to sign in again, and it'll update (and freeze app selection again!) but at least you'll have the "current" app set.<br />
<br />
Google support say this issue has been flagged with their engineers, but as they still consider Android on ChromeOS "Beta", there's no ETA on a fix; it seems like the sort of thing they'll iron out soon(ish).<br />
<br />
Google support suggested one other "hack" to address this - if you force-provision an app after you approve it, it will also wind up on enrolled devices. This is useful for "non-negotiable" apps, but seems a bit "heavy handed" for optional apps. You can of course force-install on a per-OU basis, which can be useful, but also makes things more complex and unpredictable (unless, of course... internal documentation!).<br />
<br />
As the only real "wrinkle" I've found so far, it's not a train-smash, and other than that, I'm kind of left gawping at them going "why the hell are these not the mandatory BYOD schoolwork device?"...<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDWgqwtABZRexOhQPhAaTaKtNvjMsSlvdopK6QUGx-ItABUjB3XAiZyEN4Rx5cmNp7JtQGG_rJUKHpeJBS88YiAGrjqWw8OAoxKWiPaWv5Ocu8jZb2ygtB00CZMakRjUJOml8a-BtIVq0/s1600/AcerChromebookR11_C738T_black-photogallery-04.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="380" data-original-width="420" height="289" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDWgqwtABZRexOhQPhAaTaKtNvjMsSlvdopK6QUGx-ItABUjB3XAiZyEN4Rx5cmNp7JtQGG_rJUKHpeJBS88YiAGrjqWw8OAoxKWiPaWv5Ocu8jZb2ygtB00CZMakRjUJOml8a-BtIVq0/s320/AcerChromebookR11_C738T_black-photogallery-04.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Tablet mode. For when somehow a keyboard is not ideal</td></tr>
</tbody></table>
<h3>
The Good</h3>
<div>
<i>Everything. Else. </i></div>
<div>
<br />
We've mainly seen <a href="https://www.acer.com/ac/en/ZA/content/professional-model/NX.G55EA.004">Acer R11 (mostly C738T)</a> models, plus a few <a href="https://www.acer.com/ac/en/US/content/model/NX.G54AA.002">CB5-132T</a>, and we really like them; spec-wise, they seem pretty much identical, other than the colour! Sure, there are a few small learning curves for all involved, but they're just so much better than Android or Apple tablets that it's not even funny. </div>
<div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvg6cDFtB1AXGNtMbnqOCrCOyISy6HEC2LTXxjYybU4RsZa2RDUFwKEX2DtCeFruVBuj1ByuwtsQOOzIfuWEt4ZhKm58XbtEc-YJQHLEnlAPuDTEfdq8AflU4u9S93-rVtmHyCMsDC29w/s1600/AcerChromebookR11_CB5-132T_white-photogallery-03.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="380" data-original-width="420" height="289" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvg6cDFtB1AXGNtMbnqOCrCOyISy6HEC2LTXxjYybU4RsZa2RDUFwKEX2DtCeFruVBuj1ByuwtsQOOzIfuWEt4ZhKm58XbtEc-YJQHLEnlAPuDTEfdq8AflU4u9S93-rVtmHyCMsDC29w/s320/AcerChromebookR11_CB5-132T_white-photogallery-03.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">R11 also available as a white CB5-132T.<br />
And can be used in this weird configuration.</td></tr>
</tbody></table>
<br /></div>
<div>
Particularly great features:</div>
<div>
<ul>
<li>Low cost (compared to any other commonly available device with similar specs/functionality, notably solid state storage).</li>
<li>Actual physical keyboard</li>
<li>Actual multi-touch touchscreen, with keyboard-disabling tablet mode if you want it.</li>
<li>High "ruggedness" - no moving parts.</li>
<li>Long battery life</li>
<li>Speedy reboots. </li>
<li>Lovely 1366x768 IPS screens with Gorilla Glass for scratch resistance. </li>
<ul>
<li>Reasonable screen resolution and size for the average school desk.</li>
</ul>
<li>Considerably less likely to get pushed off a desk than a Surface 3 (because: laptop form factor, with no "kickstand"). Yes, I've had a child push our imported Surface 3 off a desk during an exam. Cringe. Also, no official Surface supply in this country. </li>
<li>Obviously(!) tight integration with GSuite.</li>
<li>Locked down, centralised control (with Chrome Device Management Licenses). </li>
<ul>
<li>Getting out of it requires physical hacking of the device (to hack a new/fake serial number into the firmware), depending on settings. </li>
<li>You can lock logins down to members of your domain, which with the above really reduces resale value (of lost/stolen devices). It also means you can enforce policy fully. </li>
<li>Once a user (who owns a device) leaves, you can simply deprovision the device from centralised management, have them "powerwash" it, log in with their own personal Gmail credentials and it's "fully theirs". </li>
</ul>
<li>Centralised OS management, done by Google. </li>
<li>OS seems pretty stable and is, so far as I can see, much more secure than typical of Windows (also, less interesting to the average 0-day developer). </li>
<li>Centralised provisioning of root certificates and wireless settings.</li>
<li>Centralised, forced provisioning of "essential" apps. </li>
<li>Can work with USB and SD Cards.</li>
<li>Supports most media formats you throw at it. </li>
<li>All of the tweaks you can apply to the managed device configuration(s). So many good things!</li>
<li>"Managed" Play store feature, with (optionally) only those apps you think are worthwhile being available. </li>
<ul>
<li>There are arguments to be made here that this may be overly restrictive. </li>
<li>Of course, most children abuse this notion, and all too many will sadly abuse any other settings. The number of times you see kids just playing games in classrooms with oblivious teachers is not amusing. </li>
</ul>
<li>Pretty much all Android apps "just work". </li>
<li>Work ought to be saved in the Cloud, so if the thing gets lost/stolen/falls in the pool, they're up and running again quickly. Balanced against that, there are still offline modes, in case you're out of Internet range. </li>
</ul>
<div>
So long as you have a "true broadband" connection (and, at school level, I'm talking hundreds of megabits per second) and access to a Google Global Cache or a full Google Datacentre at low latency, life is very peachy with such devices. Obviously, you'll need robust WiFi with that... </div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeAftdOlGjwfldemJF8r-q67Gf76t2RLdaJY4xQodao-RInuSU-CKb_63Gwu6NkYPzFIb6Yum4pM1ZByVqkUIkNDRQMg_clm9sA7wy8aBR4Fo6AZVKcbyx3ylOHJkOd0eMFhuCwhzMCP8/s1600/AcerChromebookR11_C738T_black-photogallery-02.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="380" data-original-width="420" height="289" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeAftdOlGjwfldemJF8r-q67Gf76t2RLdaJY4xQodao-RInuSU-CKb_63Gwu6NkYPzFIb6Yum4pM1ZByVqkUIkNDRQMg_clm9sA7wy8aBR4Fo6AZVKcbyx3ylOHJkOd0eMFhuCwhzMCP8/s320/AcerChromebookR11_C738T_black-photogallery-02.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Tent mode. Handy for media consumption - <br />
or having a textbook open on your desk</td></tr>
</tbody></table>
<h3>
The Arbitrary</h3>
</div>
<div>
Things that might not be ideal for some, but are a non-issue for us.</div>
<div>
<ul>
<li>Limited availability - very few suppliers, very limited stock. </li>
<ul>
<li>Not an issue if you plan ahead - and not an issue in many countries. </li>
</ul>
<li>Obviously, no Windows-only software. This only affects a tiny minority of our teaching and learning, and is due to externally mandated software requirements. </li>
<ul>
<li>VDI or other Windows labs provide options to work around that. </li>
</ul>
<li>Managed/Enterprise Enrolled mode precludes the likes of <a href="https://github.com/dnschneid/crouton">Crouton</a>/Chrubuntu. </li>
<ul>
<li>Of course, if there's a compelling teaching/learning reason to need that, simply deprovision the device, have the pupil sign something promising to be extra-good, and off they/you go. One day, there will be a decent way of running a Java IDE (Hey, Oracle, port Netbeans already) and MySQL DB (likewise) on a Chromebook, and this need will fall away for us (to date, it's just the guys doing IT as a formal subject that have need of such a thing). </li>
</ul>
<li>Chrome Management Licenses are tied to a particular model "for life" - you can't, at least in GSuite for Education, simply maintain a central pool of <i>n</i> licenses in perpetuity. Increases per unit costs by about 10%. You can "recycle" within a model, however (if you RMA one, or replace one that was lost/stolen/irreparably damaged) after the previous one is deprovisioned. </li>
<li>Limited on-device storage - so long as the pupil only installs limited apps, and stores only limited amounts of media on the device, this is a non-issue. </li>
<ul>
<li>USB and SDCard support provide some useful expansion options. </li>
<li>And of course, there's all that cloud space which is what you should be using for storage.</li>
</ul>
<li>If you get a model that isn't Android-compatible, things are less rosy. Simply avoid!</li>
<ul>
<li>Make sure the models you suggest <a href="https://www.chromium.org/chromium-os/chrome-os-systems-supporting-android-apps">are Android compatible.</a></li>
</ul>
</ul>
</div>
<h3>
The Highlight?</h3>
<div>
It's the small things, sometimes. </div>
<div>
<br /></div>
<div>
<i>You can enrol users in 2FA (for a new user) without needing access to a phone number</i>. </div>
<div>
<br /></div>
<div>
This is the only platform I've seen that option for. </div>
<div>
<br /></div>
<div>
Also, once the user is logged in, they make use of a Google Prompt-like "yes/no" tap-to-allow 2FA system if they try to log in elsewhere. Obviously, if they have other 2FA capable devices, they can also be used, but for Junior School pupils who are not allowed a mobile phone, this is a real plus. If you use your own phone for them instead, firstly, it stops accepting new pairings after a while, and secondly, you'll intermittently get a storm of 2FA codes delivered to your phone from random accounts... </div>
<div>
<br /></div>
<div>
Persuade them to print out some backup codes and keep them safely, and you have a way to have junior school kids use 2FA without too much of a 'mare.</div>
<div>
<br /></div>
<div>
Thank you, Google! </div>
<h3>
The Future?</h3>
<div>
I think the "value proposition" of Chromebooks is high enough that moving from BYOD to 1:1 has a lot to be said for it. </div>
<div>
<br /></div>
<div>
Particularly because having one type of device (or at least, OS) in a class makes it MUCH easier for teachers to be on top of things, like making sure an app they want to use in a lesson (or flipped classroom activity) is likely to work, and so on. </div>
<div>
<br /></div>
<div>
The larger size also makes it a little harder to "life hack" your way into not paying attention in lessons.</div>
<div>
<br /></div>
<div>
Less of this sort of thing is only good:</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPIRIwQwIp4BnV6VMndn4Xwg8EvWIccQUFPWfXkW7NMFhg2iJ3PVRFrT5HwwM7JTfF91UghSEa8PgHzJlscnZluqkFklNs8Ckxn839N69EJR8fxbGRdI3Ai9TpkzS-0HvyswuU9oN2NWU/s1600/aid197248-v4-728px-Text-in-Class-Step-1.jpg+%25281%2529.webp" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="477" data-original-width="728" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPIRIwQwIp4BnV6VMndn4Xwg8EvWIccQUFPWfXkW7NMFhg2iJ3PVRFrT5HwwM7JTfF91UghSEa8PgHzJlscnZluqkFklNs8Ckxn839N69EJR8fxbGRdI3Ai9TpkzS-0HvyswuU9oN2NWU/s320/aid197248-v4-728px-Text-in-Class-Step-1.jpg+%25281%2529.webp" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><a href="https://www.wikihow.com/Text-in-Class">https://www.wikihow.com/Text-in-Class</a></td></tr>
</tbody></table>
<div>
I remember also seeing some inventive student had hidden their smartphone or small form factor tablet inside their pencil case to watch a live stream of a sports match... </div>
<div>
<br /></div>
<div>
It also means you can keep a small, consistent "pool" of loan devices that are quickly usable by any student who doesn't have one right now for all the usual reasons - and have almost zero use/attraction outside of school. </div>
<div>
<br /></div>
<div>
If every student has a Chromebook, do you even need to support BYOD? </div>
<div>
Can you ban smartphones in classrooms (and on your network) outright? </div>
James Stapleyhttp://www.blogger.com/profile/10040742550730807408noreply@blogger.com0tag:blogger.com,1999:blog-5949954951402579361.post-90139242712365005702018-02-06T15:51:00.000+00:002018-02-07T09:18:44.594+00:00The joys of bash - some light scripting for n00bsThere's a lot to be said for scripting in a sysadmin's life - indeed, if you <i>don't</i> do any scripting, are you even a sysadmin...?<br />
<br />
I've slowly been learning bits of bash and various related Unix utilities that are useful for processing text files - like the copious log files FreeRADIUS spits out with all sorts of useful information. I like "just in time" learning - it's often the only learning I have time for...<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHul8AqtvHXWXteMWz4pFbL11C4TTLNpGx05mU_xg4VYIHY7sbud7FQdPnePLlHPpynXFr6_3-P5sLfqmorREyWuK0R4UBT22figgwftip_PxR7Hb4XLXJ_KNuAynHfSYZg8lTyFf7ml0/s1600/bashscriptscreen.PNG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="253" data-original-width="1600" height="101" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHul8AqtvHXWXteMWz4pFbL11C4TTLNpGx05mU_xg4VYIHY7sbud7FQdPnePLlHPpynXFr6_3-P5sLfqmorREyWuK0R4UBT22figgwftip_PxR7Hb4XLXJ_KNuAynHfSYZg8lTyFf7ml0/s640/bashscriptscreen.PNG" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><i>A screenshot of a bash shell script.</i></td></tr>
</tbody></table>
In particular, I wanted to know whether certain users are abusing the system, by connecting far too many devices, and to have a count of unique devices running on our LAN for a measure of the popularity of our wireless system (and perhaps for some capacity planning). We limit device numbers for various reasons, and devices <i>must</i> be registered (on paper) prior to connection - there are only <i>n</i> spaces on the form, so <i>n</i>+1 or more == breaking the rules. Sure, electronic NAC is the obvious next step, and re-investigating Packetfence remains on my list of things to do...<br />
<br />
Looking through currently connected clients where more than <i>n</i> clients are simultaneously connected is a fool's errand, even if your wireless system has such a view (UniFi has one) - and it won't catch people who break the rules over the course of a day, or when you're not actively looking, or across two different manufacturers of Access Points. A 24 hour long RADIUS logfile record on the other hand... Well, that has a lot of potential.<br />
<br />
Of course, processing of this kind is quite easily accomplished in a simple shell script...<br />
<a name='more'></a>Have a look at a logfile you're interested in doing something with. In my scenario, it was radius.log.<br />
<br />
If I issue <i>cat radius.log </i>in the correct directory (/usr/local/var/log/radius/), I'll see the contents of the file (rapidly) scroll past.<br />
<br />
(Hint: if you actually want to read the whole file, particularly a big file longer than your scroll buffer, try <i>less <filename></i>, or open it in a text editor like vi, vim, nano or emacs.) <i>less</i> seems better than <i>more</i> (which will let you see the contents a screen at a time, but you can't scroll back so easily)!<br />
<br />
Mine has thousands of lines that look like:<br />
<blockquote class="tr_bq">
<blockquote class="tr_bq">
Tue Feb 6 10:40:47 2018 : Auth: (11689929) Login OK: [USER] (from client kcwifi port 0 cli [MAC] via TLS tunnel)</blockquote>
<blockquote class="tr_bq">
Tue Feb 6 10:40:47 2018 : Auth: (11689929) Login OK: [USER] (from client kcwifi port 0 cli [MAC])</blockquote>
</blockquote>
(I've removed real data from the above example - [USER] has a username in it, and [MAC] represents a MAC address).<br />
<br />
Obviously, one is the tunneled RADIUS event, and the other is the untunneled one. You'll see in one line, we have all the info we want - a username and a MAC address - to answer our questions. But which should we use? The one with the TLS tunnel doesn't have a trailing ")" on it, which is useful (you could get rid of it with something like <i>awk</i> or<i> sed</i>, but why not just pick the "cleanest" output with all the info you need?). I actually did that first, but then noticed the cleaner output was possible by selecting the other line.<br />
<br />
There will also be other messages occasionally, which you'll want to filter out. The easiest way to get what you want is to filter out everything EXCEPT what you want - by only selecting what you DO want.<br />
<br />
One of the joys of bash scripting is that your CLI is also usually bash, so you can experiment from the command line until you're happy, and then wrap it up nicely in a shell script.<br />
<br />
You're going to make a lot of use of pipes - | - to hand off processed text to the next step.<br />
<br />
Start out with<br />
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK'</i></blockquote>
If you've not used it, <i><a href="https://en.wikipedia.org/wiki/Cat_(Unix)">cat</a></i> basically prints out the text of a file to screen (<a href="https://en.wikipedia.org/wiki/Standard_streams#Standard_output_(stdout)">stdout</a>); <i><a href="https://en.wikipedia.org/wiki/Grep">grep</a></i> searches for things matching a pattern - it's super-useful. A very handy command I use quite often is <i>ps aux | grep <program name> </i>when I'm trying to find out whether a process is running or not (and, if I want to <a href="https://en.wikipedia.org/wiki/Kill_(command)"><i>kill</i></a> it, what its <a href="https://en.wikipedia.org/wiki/Process_identifier">PID</a> is).<br />
<br />
This will only output lines with a successful login (obviously, you can pick something else if you're interested in errors! - <i>grep -v </i>shows the inverse i.e. "not <searchterm>").<br />
<br />
That's still twice as many records as we need, so we'll pick out the lines that say TLS in them:<br />
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK' | grep TLS</i></blockquote>
Now we have one event for each successful login.<br />
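<br />
If you want to convince yourself what each filter is doing before pointing it at a real logfile, a small sketch along these lines helps. The three log lines are made up for illustration (the usernames, MAC addresses and request numbers are placeholders, not real data), and <i>grep -c</i> just counts matching lines rather than printing them.<br />

```shell
# Three made-up log lines in the same shape as the radius.log sample above.
log='Tue Feb 6 10:40:47 2018 : Auth: (1) Login OK: [27bobb] (from client kcwifi port 0 cli AA-BB-CC-00-00-01 via TLS tunnel)
Tue Feb 6 10:40:47 2018 : Auth: (1) Login OK: [27bobb] (from client kcwifi port 0 cli AA-BB-CC-00-00-01)
Tue Feb 6 10:41:02 2018 : Auth: (2) Login incorrect: [18alice] (from client kcwifi port 0 cli AA-BB-CC-00-00-02)'

# Lines that contain a successful login
ok=$(printf '%s\n' "$log" | grep -c 'Login OK')
# Everything else (-v inverts the match)
not_ok=$(printf '%s\n' "$log" | grep -v -c 'Login OK')
# Successful logins, narrowed down to the TLS tunnel line only
tls=$(printf '%s\n' "$log" | grep 'Login OK' | grep -c 'TLS')

echo "Login OK: $ok, other: $not_ok, Login OK via TLS: $tls"
```

Each login shows up twice in the log, which is why the second <i>grep</i> halves the count.<br />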
<br />
There's still way more info than I want - I just want a clean username and a MAC address.<br />
<br />
<i><a href="https://en.wikipedia.org/wiki/AWK">awk</a></i> is a tremendously useful program for processing text - it's pretty much designed for doing this quickly, and probably grew out of a sysadmin needing to quickly process text files (pretty much everything in Unix is a text file, by design). One of its useful functions is grabbing space delimited text. It turns out in my output, the 11th and 18th space delimited columns have the username and MAC I'm interested in:<br />
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK' | grep TLS | awk '{print $11, $18}'</i></blockquote>
Now I have a list of every single username and MAC; unfortunately, the username has square brackets around it.<br />
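<br />
Counting columns by eye is error-prone, so here is a quick sketch of how you might find the right numbers: get awk to number every field of one sample line for you. The line is the sanitised example from above; your own log format may well put things in different columns.<br />

```shell
# The sanitised sample line from above.
line='Tue Feb 6 10:40:47 2018 : Auth: (11689929) Login OK: [USER] (from client kcwifi port 0 cli [MAC] via TLS tunnel)'

# Print every whitespace-delimited field with its index, one per line.
printf '%s\n' "$line" | awk '{ for (i = 1; i <= NF; i++) print i ": " $i }'

# Once you know the numbers, pick out just the fields you want.
fields=$(printf '%s\n' "$line" | awk '{ print $11, $18 }')
echo "$fields"
```

By default awk treats any run of spaces or tabs as a single separator, so a day of the month padded with an extra space doesn't shift the column numbers.<br />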
<br />
Another "old" Unix utility is pretty good at changing things on the fly: <i><a href="https://en.wikipedia.org/wiki/Sed">sed</a></i>.<br />
<blockquote class="tr_bq">
<i>cat radius.log| grep 'Login OK' | grep TLS | awk '{ print $11, $18}' | sed 's/\[//;s/\]//' </i></blockquote>
This will strip out the square brackets, leaving two columns - a clean username and a hyphen separated version of ALL UPPERCASE MAC address - which is fine for my use.<br />
<br />
You could change the format here if you wanted to (to Windows all lowercase with no delimiting, or to the more familiar colon separated version). Learning to do this yourself by hacking at the output will start you on your way to learning these tools - and is exactly how I built up this little "utility".<br />
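<br />
As a starting point for that sort of hacking, here's a minimal sketch of the two conversions mentioned, using <i>tr</i> and <i>sed</i>. The MAC address is made up, and the variable names are just for illustration.<br />

```shell
# A made-up MAC address in the hyphen separated, uppercase form.
mac='AA-BB-CC-DD-EE-FF'

# Lowercase, colon separated
colon=$(printf '%s\n' "$mac" | tr '[:upper:]' '[:lower:]' | sed 's/-/:/g')

# Lowercase, no delimiters at all
bare=$(printf '%s\n' "$mac" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')

echo "$colon"
echo "$bare"
```

The trailing <i>g</i> on the sed expressions matters here: without it, only the first hyphen on each line is changed.<br />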
<br />
Now we want to get down to unique instances of each pair.<br />
<br />
To start with, let's sort them. I first thought this step was optional, but <i>uniq</i> only collapses duplicate lines that are adjacent, so sorting first is what makes the later steps give correct results (one might also try <i>sort -u</i>). It also makes it easier to see what's going on, and whether it "looks right" - and, surprisingly, it greatly decreases execution time later.<br />
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK' | grep TLS | awk '{ print $11, $18}' | sed 's/\[//;s/\]//' | sort</i></blockquote>
Hey presto, a sorted list with lots of duplicates. Guess what, there's a utility specifically for finding unique things, called <i><a href="https://en.wikipedia.org/wiki/Uniq">uniq</a></i>. Here, you need to handle users not respecting your case expectations (RADIUS and Windows auth don't care, uniq does): 27bobb is also 27BobB - same user, but not the same to uniq, unless you tell it to ignore case with <i>-i</i>.<br />
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK' | grep TLS | awk '{ print $11, $18}' | sed 's/\[//;s/\]//' | sort | uniq -i</i></blockquote>
Now we have a list of unique MACs by username. Great. But I don't want to count them by hand - I just want to know whether or not a user has broken the rules. Firstly, we don't actually need the MAC addresses anymore. Dump them with <i>awk</i> by only selecting the first (username) space delimited column.<br />
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK' | grep TLS | awk '{ print $11, $18}' | sed 's/\[//;s/\]//' | sort | uniq -i | awk '{ print $1}'</i></blockquote>
And now, how many per user? Turns out <i>uniq</i> has another trick - counting things.<br />
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK' | grep TLS | awk '{ print $11, $18}' | sed 's/\[//;s/\]//' | sort | uniq -i | awk '{ print $1}' | uniq -c </i></blockquote>
Now we have a list with a count of unique MAC addresses per username.<br />
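<br />
One wrinkle with the pipeline above: the second <i>uniq</i> is case sensitive, so 27bobb and 27BobB can still end up as separate rows. This is not what I ran, but an alternative sketch is to lowercase the username up front so that everything after it agrees. The <i>count_devices</i> name and the sample pairs are made up, and it assumes the MAC addresses themselves always arrive in the same case.<br />

```shell
# Made-up "username MAC" pairs, as produced by the awk/sed steps above.
pairs='27BobB AA-BB-CC-00-00-01
18alice AA-BB-CC-00-00-02
27bobb AA-BB-CC-00-00-01
27bobb AA-BB-CC-00-00-03
18alice AA-BB-CC-00-00-02'

# count_devices (hypothetical helper): print "username count" with one row
# per user, treating usernames case-insensitively.
count_devices() {
  awk '{ print tolower($1), $2 }' |  # normalise the username case first
    sort -u |                        # one line per unique username/MAC pair
    awk '{ print $1 }' |             # drop the MAC, keep the username
    uniq -c |                        # count adjacent identical usernames
    awk '{ print $2, $1 }'           # put the username before the count
}

counts=$(printf '%s\n' "$pairs" | count_devices)
printf '%s\n' "$counts"
```

Because the list is sorted before <i>uniq -c</i> sees it, all of a user's rows are adjacent and get counted together.<br />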
<br />
Now I only care about students, and they have a useful property compared with all other logins - they start with two numbers (all other users start with letters). This means I can get awk to only print out entries that match that pattern, along with the unique MAC count.<br />
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK' | grep TLS | awk '{ print $11, $18}' | sed 's/\[//;s/\]//' | sort | uniq -i | awk '{ print $1}' | uniq -c | awk '{ if (substr($2,1,1) ~ /^[0-9]/ ) print "\t", $2,"\t", $1 }'</i></blockquote>
<div>
You'll notice there's a regular expression there that matches any numeral, and the awk basically says "look at the first character of the username; if it matches this pattern, print out the line in this order, else discard".</div>
<div>
<br /></div>
<div>
I then have another (unnecessary) sort, mainly so I can check my work (looking for users that haven't collapsed down to a single instance, for example). </div>
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK' | grep TLS | awk '{ print $11, $18}' | sed 's/\[//;s/\]//' | sort | uniq -i | awk '{ print $1}' | uniq -c | awk '{ if (substr($2,1,1) ~ /^[0-9]/ ) print "\t", $2,"\t", $1 }' | sort</i></blockquote>
<div>
I also prefer to see username before count, so I've swapped them around in the output. Notice the $<numeral> syntax for dealing with space separated "columns". </div>
<br />
Now things get complex, as I have a number of conditions. The numbers actually tell me what grade you're in (they are the last two numerals of the last year you're in high school). There are different rules depending on your grade - this used to be more complex, as there were older pupils who were only allowed one, but we now allow all high school pupils up to two devices (unique MACs), and all junior pupils are allowed up to one (unique MAC).<br />
<br />
So we need some conditional awk. It's not pretty, but it works.<br />
<br />
If you know that everyone in junior school has the number 23 or above in their username, and everyone in high school has 22 or less, it's pretty easy. (You could do some basic math to work this out based on current year and have a variable that changes and is inserted into the right spots, so it's always accurate, but I just edit the two numbers each year - this is particularly useful if you get to the point of creating a little shell script file you can run whenever).<br />
<br />
Basically it says:<br />
<br />
"If the first two characters of the username (column 1) are 23 or above and the count (column 2) is 2 or more, print the line out in the order <username> <count>; else, if they are 22 or less and the count is 3 or more, print it out too." Anything that doesn't match either of these conditions hasn't broken the rules and is tossed out.<br />
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK' | grep TLS | awk '{ print $11, $18}' | sed 's/\[//;s/\]//' | sort | uniq -i | awk '{ print $1}' | uniq -c | awk '{ if (substr($2,1,1) ~ /^[0-9]/ ) print "\t", $2,"\t", $1 }' | sort | awk '{ if (substr($1,1,2) >= 23 && $2 >= 2 ) print "\t", $1,"\t", $2; else if (substr($1,1,2) <= 22 && $2 >= 3 ) print "\t", $1, "\t", $2; }'</i></blockquote>
There's a lot of joy to building up a chunk of text like this that does something <i>useful</i>.<br />
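<br />
For the "basic math" version mentioned above, here's a rough sketch of how the cut-off could be worked out from the year rather than edited by hand. It assumes what the numbers above imply for 2018 (high school usernames run from the current two-digit year up to that plus four, i.e. 18 to 22); <i>flag_offenders</i> and the sample usernames are made up for illustration.<br />

```shell
# flag_offenders (hypothetical helper): $1 is a four-digit year; stdin is
# "username count" lines. Prints the users who are over their device limit.
flag_offenders() {
  # Highest username prefix still in high school, e.g. 2018 -> 22 (assumption)
  high_max=$(( ($1 % 100) + 4 ))
  awk -v high_max="$high_max" '{
      grade = substr($1, 1, 2) + 0         # adding 0 forces a numeric value
      limit = (grade > high_max) ? 1 : 2   # juniors get 1 device, high school 2
      if ($2 > limit) print $1 "\t" $2
  }'
}

# Made-up sample counts.
counts='18alice 2
19carol 3
23dave 2
24erin 1'

naughty=$(printf '%s\n' "$counts" | flag_offenders 2018)
printf '%s\n' "$naughty"
```

In a real script you could call it with <i>$(date +%Y)</i> instead of a fixed year, at the cost of the report changing meaning the moment the calendar year rolls over, which may or may not line up with your school year.<br />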
<br />
Getting a count of unique MAC addresses that have correctly authenticated is much simpler:<br />
<blockquote class="tr_bq">
<i>cat radius.log | grep 'Login OK' | grep -v TLS | awk '{ print $18}' | sort | uniq | wc -l</i></blockquote>
Of course, what's really nice is a basic report that does a bunch of processing and outputs a neat summary of the trouble-makers. Put this into a file with a .sh extension that starts with a <a href="https://en.wikipedia.org/wiki/Shebang_(Unix)">shebang</a>, <i>chmod +x</i> the file, and you have an executable script you can call at any time with ./<scriptname>.sh<br />
<br />
Here's mine:<br />
<blockquote class="tr_bq">
#!/bin/bash<br />
input="/usr/local/var/log/radius/radius.log"<br />
filedate=`cat $input | awk 'NR==1{print $1, $2, $3, $5; exit}'`<br />
uniquemac=`cat $input | grep 'Login OK' | grep -v TLS | awk '{ print $18}' | sort | uniq | wc -l`<br />
echo ""<br />
echo "File processed: $input"<br />
echo ""<br />
echo "File date: $filedate"<br />
echo ""<br />
echo The following users have been very naughty:<br />
echo ""<br />
cat $input | grep 'Login OK' | grep TLS | awk '{ print $11, $18}' | sed 's/\[//;s/\]//' | sort | uniq -i | awk '{ print $1}' | uniq -c | awk '{ if (substr($2,1,1) ~ /^[0-9]/ ) print "\t", $2,"\t", $1 }' | sort | awk '{ if (substr($1,1,2) >= 23 && $2 >= 2 ) print "\t", $1,"\t", $2; else if (substr($1,1,2) <= 22 && $2 >= 3 ) print "\t", $1, "\t", $2; }'<br />
echo ""<br />
echo "Unique MAC addresses: $uniquemac"<br />
echo ""</blockquote>
This outputs information like date, a formatted list of naughty users and unique MAC address count. It should be very easy for you to modify for your environment and needs.<br />
<br />
Another thing that is useful to learn is how to pass an argument from the command line to a script. For example, say a pupil would like to know which MAC addresses are being naughty: it would be good to be able to pass the username to a script and have it spit out a list of unique MACs associated with that username. Of course, with more work, you could probably further modify the script above to include a report of the infringing MAC addresses associated with each currentnaughtykid, but that's more complicated than I needed, so I've not done it.<br />
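<br />
For the curious, a per-user MAC report along those lines might look something like the sketch below. <i>per_user_macs</i> is a made-up name, it lists every TLS user rather than only the rule-breakers, and the field positions ($11 for the username, $18 for the MAC) are simply the ones used in the one-liners above, so they depend on your RADIUS log format.<br />

```shell
#!/bin/bash
# Sketch only: per_user_macs is a hypothetical helper, not part of the scripts
# above. $11 (username) and $18 (MAC) depend on your RADIUS log format.
per_user_macs() {
    local logfile=$1
    grep 'Login OK' "$logfile" | grep TLS |
        awk '{
            gsub(/\[/, "", $11)       # strip the brackets around the username
            gsub(/\]/, "", $11)
            print $11, $18
        }' |
        sort -f | uniq -i |           # unique username/MAC pairs, ignoring case
        awk '{
            if ($1 != last) { print $1 ":"; last = $1 }   # heading per username
            print "\t" $2                                   # one MAC per line
        }'
}

# Usage: per_user_macs /usr/local/var/log/radius/radius.log
```

Because the pairs arrive sorted by username, the last awk stage only needs to remember the previous username to know when to print a new heading.<br />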
<br />
Here's a script that takes a username as an argument from the command line and spits out the associated unique MAC addresses. You'll see the <i>username=$1</i> line, which is bash for "take the first argument passed on the command line and set this variable to it"; that variable is later used in the line that processes the RADIUS logfile:<br />
<blockquote class="tr_bq">
#!/bin/bash<br />
filename="/usr/local/var/log/radius/radius.log"<br />
username=$1<br />
echo "$username has used the following unique MAC addresses in $filename:"<br />
cat $filename | grep 'Login OK' | grep TLS | grep $username | awk '{print $18}' | sort -u</blockquote>
<div>
This then quickly spits out all the unique MAC addresses associated with that username. You can tart it up with various formatting and spacing options, but it's a quick tool to answer a question that used to be a bit more... painful. </div>
<div>
<br /></div>
<div>
I prefer this to the alternative, which is to call for user input after execution, which you can also do:</div>
<div>
<blockquote class="tr_bq">
#!/bin/bash<br />
filename="/usr/local/var/log/radius/radius.log"<br />
echo "Please type the username you want unique MAC addresses for from $filename..."<br />
read username<br />
echo "$username has used the following unique MAC addresses:"<br />
cat $filename | grep 'Login OK' | grep TLS | grep $username | awk '{print $18}' | sort -u</blockquote>
</div>
<div>
You may have noticed I've called the filename as a variable - which means you can point it at other files (say you want to process the one from yesterday...). </div>
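<div>
A slightly more defensive take on the same idea is sketched below: it refuses to run without a username and accepts the logfile as an optional second argument. <i>macs_for_user</i> is a made-up name and the rotated-log path in the usage comment is hypothetical; the default path is just the one from the scripts above.</div>

```shell
#!/bin/bash
# Sketch only: macs_for_user is a hypothetical helper name.
macs_for_user() {
    # ${1:?...} aborts with a usage message if no username was given
    local username="${1:?usage: macs_for_user USERNAME [LOGFILE]}"
    # ${2:-...} falls back to the default logfile if no second argument is given
    local filename="${2:-/usr/local/var/log/radius/radius.log}"
    # -F treats the username as a literal string rather than a regex
    grep 'Login OK' "$filename" | grep TLS | grep -F -- "$username" |
        awk '{ print $18 }' | sort -u
}

# Usage:
#   macs_for_user 23smith                         # default logfile
#   macs_for_user 23smith /path/to/radius.log.1   # hypothetical rotated log
```

Quoting the variables matters here: an unquoted empty <i>$username</i> leaves grep with no pattern at all.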
<div>
<br /></div>
<div>
I have no doubt there are (much) more elegant ways of achieving these features (there are, for instance, some arguably unnecessary sorts, and you could probably collapse some of the editing and selection steps with more complex regex) - there are certainly plenty of different utilities you could use within bash to the same end, or you could even write programs in perl or python etc. - and purists would say "you could do your <i>grep</i> selections in <i>awk</i> or<i> sed</i>"... Getting better is always a good aim, but getting <i>somewhere </i>is still useful, even if it's not pretty or elegant. </div>
<br />
But this works - and developing a useful tool quickly with things you already know is often better than over-complicating something like parsing a logfile for an answer to a frequently asked question like "who is breaking the rules today?" Of course, it's also a useful first step in more advanced file processing and automation - a little work now will save a lot of work over time. And eventually, you'll learn what bash scripting wizards know, and adopt those better practices.<br />
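<br />
As one illustration of the "do your <i>grep</i> selections in <i>awk</i>" suggestion, the unique MAC count from earlier could be collapsed into a single awk call. This is a sketch, not a replacement for the working pipeline: <i>count_unique_macs</i> is a made-up name, and $18 is again the MAC field in the log format used above.<br />

```shell
#!/bin/bash
# Sketch only: count_unique_macs is a hypothetical helper name.
count_unique_macs() {
    # One awk process does the selecting (Login OK, not TLS), the
    # de-duplicating (the seen[] array) and the counting.
    awk '/Login OK/ && !/TLS/ && !seen[$18]++ { n++ }
         END { print n + 0 }' "$1"
}

# Usage: count_unique_macs /usr/local/var/log/radius/radius.log
```

The <i>!seen[$18]++</i> test is only true the first time a given MAC turns up, which is what replaces the sort and uniq steps.<br />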
<br />
Of course, once you learn a couple of these commands, you'll likely think of other things you could usefully parse for answers to questions you commonly ask...<br />
<br />
A useful exercise for the reader, for bonus points: work out how to trigger such scripts from <i>cron</i>, and get the cronjob to email the results to you! Trigger it on yesterday's file, rolled over by logrotate.<br />
<br />
Poorly Documented Feature: Canned Responses in Delegated Accounts (Gmail)<br />
Published 2018-02-01, updated 2018-02-05.<br />
<br />
It's no secret. We love Gmail. However, sometimes, there are features that don't get used much that are absolute "killer" features - but it turns out they're not always well documented.<br />
<br />
A common scenario is creating generic "role based" accounts that receive large volumes of mail, and then delegating access to these mailboxes to several individuals to deal with the responses.<br />
<br />
Of course, a lot of incoming emails means a lot of outgoing responses (often the exact same thing hundreds of times), and there's a really handy feature, Canned Responses, in Labs that makes this a pleasure.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhcmyyonqG8QNxDrzJIDDB82PkNpQ1erKfovFSuVtNa3iUyaJIV8SFikCROBWYwrMxruYhUO8AKjAeb1g3VHqdnMzhrNBNadmkl_SLYgYk5BVMwnkzbWXo0xBBDn1yHKeMghlB3bkai0U0/s1600/LabsCannedResponses.PNG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="98" data-original-width="598" height="52" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhcmyyonqG8QNxDrzJIDDB82PkNpQ1erKfovFSuVtNa3iUyaJIV8SFikCROBWYwrMxruYhUO8AKjAeb1g3VHqdnMzhrNBNadmkl_SLYgYk5BVMwnkzbWXo0xBBDn1yHKeMghlB3bkai0U0/s320/LabsCannedResponses.PNG" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><i>Labs Canned Responses. <br />Thank you Googler Chad P, <br />bulk email responders LOVE you.</i></td></tr>
</tbody></table>
However, if you switch across to a delegated account, shock, horror!<br />
No Labs!<br />
Does that mean no canned responses in delegated accounts? No it does not (phew)...<br />
<br />
<a name='more'></a><br />
Of course, the secret is a little lateral thinking; my colleague clearly got the caffeine hit before I did, because he suggested this first. If you log into the main account you want to delegate (i.e. sign in as the delegated account), you can enable the Canned Responses Gmail Labs feature there, and set up your desired Canned Response(s) - and those will then be available to the delegate user(s) of that account too. Yay!<br />
<br />
You may need to clear all your browser cache and cookies and/or restart Chrome if something doesn't show up in the "delegatee's" Gmail account (particularly if, despite accepting the delegation, the option to switch context to the delegated account never shows up), but once that's done, it works well, and the person (or people) handling what might amount to hundreds of requests for information will be able to answer them accurately and in detail with just four or five clicks. Of course, for some incredibly high traffic addresses, an autoresponder with the required information may be better, but most people like a "personal touch"!<br />
<br />
If that's not clear, if you've set up say scholarships@myschool.com, and you want j.randomuser@myschool.com to deal with that mail, the delegated account is scholarships@ and the "delegatee" is j.randomuser@.<br />
<br />
One more "trick" - you can't have an attachment as part of a canned response, yet canned responses often require "attached" documentation in response to a query. The solution there, of course, is to hyperlink to one or more files stored on a website, or in Google Drive with the correct sharing permissions ("anyone with the link can view" or "public on the web" are probably good options, depending on the confidentiality of the information - likely low in this situation).<br />
I usually feel it's worth using a URL shortener, even if you're simply using hyperlinks on words like "fill in the application form" - I usually use goo.gl to stay within one ecosystem, as Drive links are usually incredibly long. It may also be worth using a Team Drive to store such documentation, notably to ensure less "link rot" as staff change.<br />
<br />
This also means that the "line manager" of the role based account (who should be the one that has the login credentials) can decree exactly what the canned responses should be.<br />
<br />
Delegation is a better bet than sharing credentials, and I'm a firm believer that on the whole, email aliases are evil (loss of institutional knowledge and hand-overs can be a messy thing), and real role-based accounts are the right solution to "role based" email addresses. With Delegation, you have accountability and the ability to have several people share a job role without complications - whilst still allowing sane use of 2FA.<br />
<br />
Happy bulk email operations!<br />
<br />
Further reading:<br />
<a href="https://support.google.com/mail/answer/138350?hl=en">Mail Delegation</a><br />
<a href="https://support.google.com/a/answer/7223765?hl=en">Enabling Mail Delegation in GSuite</a><br />
<a href="https://support.google.com/a/answer/117099?hl=en">Enable Gmail Labs (GSuite)</a><br />
<a href="https://gsuite.google.com/learning-center/products/drive/get-started-team-drive/">Team Drive</a><br />
<br />
Weak defaults in IIS 8 cryptography (TLS/HTTPS/SSL)<br />
Published 2017-11-21 (updated the same day).<br />
<br />
So, this isn't exactly a breaking news headline. IIS isn't the most secure web server in the universe. However, it came as a shock to get a "C" grade on testing our school's management information system portal on Qualys' rather awesome SSL Labs tester.<br />
<br />
Even if you're not governed by things like GDPR or POPI, it should be a point of ethical professional practice to ensure there isn't a hole large enough to drive a bus through in your security infrastructure.<br />
<br />
Fortunately, there's a <i>really</i> easy way of fixing this.<br />
<br />
<a name='more'></a>First of all, pick an https:// site you want to check the vague security level of - probably one of your own! Put it into <a href="https://www.ssllabs.com/ssltest/analyze.html">https://www.ssllabs.com/ssltest/analyze.html</a> and see what score it gets.<br />
<br />
Reading about securing IIS protocols and cipher suites (i.e. getting rid of the really crappy ones that get you bad compatibility grades), I found <a href="https://www.petri.com/cipher-best-practice-configure-iis-ssl-tls-protocol">this article</a>, which led to a <a href="https://www.nartac.com/Products/IISCrypto">rather awesome little tool from Nartac</a> that makes it quick and easy to set various registry settings, and gets you from the "C" grade of the default IIS 8 settings on Server 2k12r2 to an "A".<br />
<br />
Getting an A+ is going to take more work, as you're probably going to have to remove all "weak" ciphersuites (the best practice settings below don't, in order to ensure broader compatibility) and use things like CAA, HPKP and HSTS. If you want to see a domain that gets an A+ at the time of writing, <a href="https://goo.gl/7DjRrB">see this</a>; you'll notice some ancient things fail to negotiate a connection because they're too old for the allowed protocols/cipher suites (and should not be used anyway).<br />
<br />
To get your A grade, simply apply the Best Practices template settings (unless one of the other templates is more appropriate to your environment, or you know what you're doing and have even more secure settings in mind), reboot, and re-test.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGM66vL_2Ek1CRYS29bQVnOwYsTe0QwbJePYJZW8VT7HU6pyu1YO1SdsftNG3NmJqPIIUviJynPQMSFBq0sSmfc3hqB9FJ3G7q2wTwCUNCd5k7_LHo-uouHB-VDux3NXgGGgyMkRzRon0/s1600/IISCrypto.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="701" data-original-width="900" height="249" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGM66vL_2Ek1CRYS29bQVnOwYsTe0QwbJePYJZW8VT7HU6pyu1YO1SdsftNG3NmJqPIIUviJynPQMSFBq0sSmfc3hqB9FJ3G7q2wTwCUNCd5k7_LHo-uouHB-VDux3NXgGGgyMkRzRon0/s320/IISCrypto.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">As easy as 1-2-3 reboot!</td></tr>
</tbody></table>
Doing this helps your less informed site visitors: browsers that are too unsafe to even allow on the Internet simply fail to connect, while better ones with poor cipher suite choices are forced onto reasonably secure connection parameters over TLS 1.2+ (rather than, say, the deprecated SSL 3.0).<br />
<br />
Microsoft's official line on this is <a href="https://support.microsoft.com/en-us/help/245030/how-to-restrict-the-use-of-certain-cryptographic-algorithms-and-protoc">in this article</a>.<br />
<br />
If you want a quick overview of doing SSL/TLS right, Qualys has a <a href="https://github.com/ssllabs/research/wiki/SSL-and-TLS-Deployment-Best-Practices">nice, straightforward guide</a>.<br />
<br />
While you're at it, consider implementing <a href="https://www.digicert.com/dns-caa-rr-check.htm">DNS-based CAA</a>, which is <a href="https://blog.qualys.com/ssllabs/2017/03/13/caa-mandated-by-cabrowser-forum">now a (mandatory) practice</a> for CAs - but ensure you understand what it does and how to correctly use it in your environment. It's not mandatory for you to publish CAA records (yet), but good CAs must check for them and abide by the rules - so it makes sense to restrict this as much as is sensible for your domains relative to the CA(s) you use. Basically, you tell the Internet which CAs are valid for your domain, preventing any other compliant CA from issuing a certificate for it. You can also consider <a href="https://www.owasp.org/index.php/HTTP_Strict_Transport_Security_Cheat_Sheet">HSTS</a>/HPKP and DNSSEC/<a href="https://www.internetsociety.org/resources/deploy360/dane/">DANE</a> if you want to go to the next level, and they are appropriate in your environment.<br />
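<br />
To see which CAs a domain currently authorises, you can query its CAA records with <i>dig</i> and pull out the issuer names. A rough sketch: <i>caa_issuers</i> is a made-up helper, example.com is a placeholder, and it assumes the usual <i>dig +short</i> output of one <i>flags tag "value"</i> record per line.<br />

```shell
#!/bin/bash
# Sketch only: caa_issuers is a hypothetical helper name.
# Reads `dig +short CAA <domain>` output on stdin, e.g.  0 issue "ca.example"
# and prints the tag and CA name for the issue/issuewild records.
caa_issuers() {
    awk '$2 == "issue" || $2 == "issuewild" {
        gsub(/"/, "", $3)     # strip the quotes around the value
        sub(/;$/, "", $3)     # drop a trailing ";" left by any parameters
        print $2, $3
    }'
}

# Usage (needs network access; example.com is a placeholder):
#   dig +short CAA example.com | caa_issuers
```

No output at all means the domain publishes no issue/issuewild records, so there is no restriction from that domain's own record set.<br />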
<br />
Worryingly enough, this post came about from a discussion about Internet Banking security. We were not impressed by what banks are doing - they're mostly behind the curve, and have some pretty dumb opinions on Internet security. They also like to make the lack of security your problem in their terms and conditions. :/<br />
<br />
RTFM: Speeding up your (Fortigate) firewall performance<br />
Published 2017-10-23 (updated the same day).<br />
<br />
I was witness to the installation of the (yes, single) Fortigate 300C firewall the school uses; however, it was not my own configuration/installation/design, although I've maintained it for several years now.<br />
<br />
We've been having intermittent issues with close to 100% CPU usage and a sort of live lockup where the Fortigate responds, but packets do not flow. Some of the scanning engines (ipsengine or ipsmonitor) monopolise CPU time and need to be restarted several times [<a href="https://www.youtube.com/watch?v=dLBx3g8cowY">three, that's the magic number</a> - <i>diagnose test application ipsmonitor 99</i> - waiting several minutes between attempts], or the unit has to be rebooted, with resulting network chaos. And packets, like the spice, must flow.<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7ewHOoSqKZms5TVj5GMbTZ4AZXgkBGMEcjbraNmE5da3uyTtbI3ygEjIqx6lMOb4r2PXo23aMFzggDPjAVC5RG_1I-_S0vgMUktqgN4ISdOsZa7fAoM8RnCcibvja6z_L2nJIs3SUoF4/s1600/PacketsMustFlow.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="249" data-original-width="552" height="144" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7ewHOoSqKZms5TVj5GMbTZ4AZXgkBGMEcjbraNmE5da3uyTtbI3ygEjIqx6lMOb4r2PXo23aMFzggDPjAVC5RG_1I-_S0vgMUktqgN4ISdOsZa7fAoM8RnCcibvja6z_L2nJIs3SUoF4/s320/PacketsMustFlow.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
The "death knell" for the 300C was a member of senior management unilaterally decreeing (without asking IT) that a whole year of pupils could have twice as many devices - a year ahead of schedule, and before the planned replacement to deal with the load... So I've been looking for ways to eke out a little more performance until we can afford/acquire a replacement for it.<br />
<br />
It turns out that one of the design decisions that was made was not ideal - it completely disables the use of the onboard dedicated traffic ASICs...<br />
<br />
Unfortunately, schools need quite paranoid and intense filters, and this comes at a cost (in terms of power, and price!).<br />
<br />
<a name='more'></a>Obviously, if you have a gadget that is designed to offload traffic from a busy CPU, you want to be using it.<br />
<br />
In the initial configuration, a bridged switch interface was created to allow fairly transparent connectivity between one firewall and two core/dist/agg switches. This disables offloading, as in this configuration, every packet must be handled by the general purpose CPU.<br />
<br />
If you want to do this (have multiple redundant links to the firewall), then if you have L3 capable switches, you may be better off routing two interfaces with appropriate weightings and IP addressing, or using LACP if you have MLAG (or Virtual Chassis) capable core infrastructure. Interestingly, nobody pointed this out when I was having performance issues and had a ticket open with Fortinet (and they had full config files) - it ought to be a performance red flag.<br />
<br />
If you <b>don't</b> create a switchport, you will make use of features like "fast path" - so long as the requirements are met for that traffic to be offloaded. Such features offload some of the processes/sessions that would otherwise tax your unit's CPU, leading to one or more of faster throughput or greater ability to handle a lot of traffic/connections.<br />
<br />
Aside from the software switch, there are a few other features that disable this too, like enabling sFlow/NetFlow, and strict protocol header checking.<br />
<br />
You should read the FortiASIC guide for the model of Fortigate and the version of FortiOS you're using. Our 300C uses the CP6 and NP2 ASIC chipsets. This one covers <a href="http://docs.fortinet.com/uploaded/files/2151/fortigate-hardware-accel-528.pdf">FortiOS 5.2.8</a>. This will help you understand the limitations, and some tweaks that may - or may not be - relevant to your model and unique circumstances. The general <a href="http://docs.fortinet.com/uploaded/files/1954/Best_Practices_52.pdf">best practices documentation</a> is probably also worth your time.<br />
<br />
You should also note that fastpath (and related offloads), unless there is an interconnect between the NP chips (Fortigate's terminology seems to be EEI, which exists on the 300C), <i>requires</i> the inbound and outbound (ingress/egress) interfaces to be on the same chip - check your model. For ours, any of the first 8 ports are acceptable; the last two ports are not on the ASIC - these are useful for management and, particularly, tftp uploads of firmware (ASIC accelerated ports cannot do tftp - handy to know if your unit ends up very unhappy).<br />
<br />
My other attempts to improve performance have mainly centered around <a href="https://schoolsysadmin.blogspot.co.za/2016/09/when-your-firewall-dies-and-you-need.html?showComment=1508755252171#c4697137049091165320">massively reducing the logging levels</a> (annoyingly, turning this on greatly decreases Fortigate performance). <br />
Turning off SSL Certificate Inspection (not even full SSL interception) turned out to be a terrible idea (without it, https:// connections are a mystery to the unit - with the expected results... :( ).<br />
I also tweaked <a href="http://kb.fortinet.com/kb/viewContent.do?externalId=FD33078">some of the session timers</a> as per some guide I read somewhere - shortening unnecessarily long "open" sessions clears out the state table a little more regularly, clawing back some resources.<br />
<br />
It's early days, as I only made the interface changes on Friday (the end of our Half Term break) - school is back in session now, so we'll see how it holds up and if it helps with our "live lockup" issue, or the issue where SSO doesn't always work...! <br />
I'm seeing (System>FortiView>All Sessions>filter FortiASIC Accelerated) around 17-20% of sessions being passed to the ASIC now, and there seems to be a bit of a decrease in overall CPU usage - although we've had a 99% CPU usage this morning, traffic still flowed - and that's progress, of a sort!<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWuZnyL_a7Uj200ji-NdDfRO8hFc99aPYvBIbMEkHMlz91OF0elxt9mpG_omYud2sB8bxPxoLSVevyPVqBmgYwrIXtou7gifpECaUZCiMqep4179h8hrquMK0LVyZkwjbxRUZqNUotQa0/s1600/fortiasic.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="867" data-original-width="1600" height="173" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWuZnyL_a7Uj200ji-NdDfRO8hFc99aPYvBIbMEkHMlz91OF0elxt9mpG_omYud2sB8bxPxoLSVevyPVqBmgYwrIXtou7gifpECaUZCiMqep4179h8hrquMK0LVyZkwjbxRUZqNUotQa0/s320/fortiasic.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">yay, accelerated connections.</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgcHBvnOng4pjOtAGTcxvQMTx8m8IIq44rifoecLvt4SzqnEnjaoNok9OjahmlEU0oNan-RHev8k_reukYCb-umHAlXRuiVJf5E3osFLsNLzytU8HiEdlZX08sBysazcu1H22f58ko2eI/s1600/cpuusagelower.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="310" data-original-width="585" height="169" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgcHBvnOng4pjOtAGTcxvQMTx8m8IIq44rifoecLvt4SzqnEnjaoNok9OjahmlEU0oNan-RHev8k_reukYCb-umHAlXRuiVJf5E3osFLsNLzytU8HiEdlZX08sBysazcu1H22f58ko2eI/s320/cpuusagelower.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Yay, lower CPU... I think. </td></tr>
</tbody></table>
Sometimes, going back to the manuals and reading what they have to say about things is a good move. In fact, it's always a good move when things are not... right.