
AoIP (Audio over IP)

GET CONNECTED

  • Supercharge your commercial audio
  • Flexible, Scalable, no dedicated cabling
  • Configure, Control & Manage via GUI
  • Microsoft Teams Integration
  • Integrate with traditional audio systems

What is AoIP (Audio over IP)?

AoIP, or ‘Audio over IP’, is a digital coding and encapsulation process used to transmit audio signals over IP network infrastructure. The terms ‘Network Audio’ and ‘IP audio’ are used interchangeably with AoIP. The approach offers functionality, capability and scalability that are often not possible with traditional analogue audio systems.

Why Audio over IP?

Compared with traditional audio technology, AoIP sound systems support a wider range of applications, greater functional capability and more flexible installation topologies.

The audio routing path between AoIP entities (eg. coders, microphones, processors, players & speakers) is not fixed by physical cabling as it is in traditional audio systems. Removing this restriction gives AoIP system design options that traditional audio cannot offer (eg. improved scalability and installation / deployment flexibility).

How do Audio over IP sound systems work?

Most commercial audio system designs employ four basic entities:

  • Audio Input Source (eg. microphone, music player)
  • Audio Processing (firmware & software) 
  • Audio Routing & Distribution (transmission technology and cabling)
  • Audio Output (Play-out devices, speakers)

With traditional audio systems, audio entities are connected ‘chain like’ by a dedicated fixed cabling infrastructure. This arrangement has stood the test of time and works well for basic commercial audio applications. However, the fixed cabling creates a rigid topology which limits capability and function for more enhanced or integrated audio system designs.

In an Audio over IP system, entities connect to the network via Ethernet cables, which in turn connect to a network switch. From that point, the audio path is no longer a physical end-to-end connection but a logical one, carried as streamed IP packets. IP audio therefore offers improved scalability and flexibility for audio routing and for integration with other network connected audio equipment and entities.

IP audio source entities such as an IP microphone contain an integrated IP ‘encoder’ that converts the electrical audio signal to IP audio. Once the device is connected to the network, the resulting IP packets are routed and streamed to one, many or all IP speakers (play-out devices). IP speakers contain an integrated IP ‘decoder’ and mini-amplifier that convert IP audio back to an electrical signal and drive the speaker for audio output.

Audio over IP - 'Entities'

The choice of IP audio devices is primarily based on the required audio application(s). For example, a simple IP Paging, IP PA or IP Tannoy application is likely to require IP Microphone and IP Speaker entities. Depending on the scale and complexity of the design, device firmware or dedicated software may be required to provide enhanced configuration, control and management.

[Figure: IP audio system entities]

IP Audio 'Entities' and Products

IP audio devices and their supporting technologies and standards are evolving rapidly. Some devices now offer multiple entity capabilities (eg. a single device that provides both IP encoding and IP decoding). Designs that integrate traditional audio with IP audio add further considerations, and sometimes complexity, to device selection.

However, for many commercial IP audio projects, devices can largely be grouped into the four ‘entity’ categories described earlier: audio input source, audio processing, audio routing & distribution, and audio output.

Audio over IP - Applications

Audio over IP applications are widely employed across commercial and industrial sectors.

AoIP applications are not mutually exclusive and can be combined with other IP audio applications. Mixed-technology designs (IP audio & traditional audio) can also be integrated to create highly capable and cost-effective audio solutions.

AoIP & Integrated Audio Technologies

To provide improved functional and operational capability, many new commercial audio systems are now designed and installed using AoIP device entities. Some new installations may even use AoIP as their only audio technology. However, it is important to recognise that, for various reasons, AoIP may not be the ‘best fit’ audio technology for every commercial project. For a start, there is a huge installed base of existing traditional (legacy) commercial audio systems. For smaller sites, sites with limited network infrastructure, or audio designs requiring only very basic function/capability, traditional audio is likely to remain an attractive technology option.

In such cases, combining both audio technologies, so that the innovation and capability of AoIP complements the huge established base of traditional audio, can offer a powerful proposition for many commercial audio installation projects.

Integrated Audio: example

MS Teams integration with AoIP and legacy audio


AoIP - Protocols & Standards

This section outlines the protocols and standards used by Audio over IP (AoIP), and how IP audio and wider IP sound systems communicate and operate within IP networks (the Internet protocol suite, aka TCP/IP).

Outline

Like any IP network connected device, IP audio devices from different manufacturers adhere to the core Internet protocols and standards, which are maintained through the IETF (Internet Engineering Task Force).

Here we discuss IP audio, identify notable network protocols and standards, and explore how developers and manufacturers use them.

[Figure: OSI layers and IP audio protocols and standards]

IP Audio & Networking Frameworks (OSI & TCP/IP)

Several well-established networking frameworks describe how different computer systems (including IP audio systems and IP audio device entities) connect and communicate with each other.

The OSI 7-layer model and the TCP/IP 4-layer model are two widely used, similarly layered inter-networking architectures. The main difference is that the OSI model is generic and focuses on the function of each layer, whereas the TCP/IP model is more protocol oriented.

IP Audio, Layers 1 & 2

Often referred to as the ‘physical’ and ‘frame’ layers, L1 & L2 (the network interface layer) have little to do with specific IP audio protocols or standards. Precision Time Protocol (PTP) is a notable exception: many IP audio applications use it for clock synchronisation, enabling accurate timing for audio capture and playout.
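As a rough illustration of how PTP style clock synchronisation works, the sketch below applies the standard delay request-response arithmetic to four timestamps. It assumes the network delay is the same in both directions; the function and variable names are illustrative only and are not taken from any particular PTP implementation.

```python
def ptp_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Estimate clock offset and one-way delay from four PTP timestamps.

    t1: master clock time when the Sync message was sent
    t2: slave clock time when the Sync message arrived
    t3: slave clock time when the Delay_Req message was sent
    t4: master clock time when the Delay_Req message arrived
    Assumes the path delay is symmetric (same in both directions).
    """
    master_to_slave = t2 - t1  # includes +offset
    slave_to_master = t4 - t3  # includes -offset
    offset = (master_to_slave - slave_to_master) / 2
    delay = (master_to_slave + slave_to_master) / 2
    return offset, delay


if __name__ == "__main__":
    # Slave clock runs 5 units ahead of the master; one-way delay is 2 units.
    offset, delay = ptp_offset_and_delay(t1=100, t2=107, t3=120, t4=117)
    print(f"offset={offset}, delay={delay}")
```

An IP speaker that knows its offset can correct its local clock, so that several speakers play the same audio sample at the same moment.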

Otherwise, generic networking protocols and standards apply. Most commercial IP audio entities connect to the network via an industry standard RJ45 network interface controller (‘NIC’) and structured network cabling (eg. cat5e/cat6). Some IP audio entities are wireless, typically connecting via WiFi. Every network connected device, including IP audio devices, has a unique Media Access Control (MAC) address, which the serving network switch records in its MAC address table. Virtual LANs (VLANs) are commonly configured to give IP audio applications a dedicated network broadcast domain, which also simplifies administration and improves access control / security.

IP Audio, Layer 3

Layer 3, the network layer, handles IP addressing and routing and is often referred to as the IP ‘packet’ or ‘logical’ layer. IP audio applications depend on this layer to manage and move audio across networks.

Unlike the unique per-device MAC addresses discussed for layer 2 (network interface layer), an IP address can refer to a single network device, a group of devices or an entire broadcast domain (unicast, multicast, broadcast). IP audio applications use this flexibility to control and route audio and to create logical groupings for playout (eg. audio paging zones or lower priority background music, BGM zones). This has clear advantages over traditional audio, where entities are assigned by dedicated fixed cabling. Routing IP audio via multicast (say, from a music source to a BGM zone) is not without its challenges. To limit the resource overhead of multicast flooding, receivers use IGMP to signal multicast group membership, and switches with IGMP snooping enabled forward each stream only to ports that have members.
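To make the idea of logical playout zones more concrete, here is a minimal sketch of how a receiving device might join the multicast group for a zone using Python's standard socket module. The zone names, group addresses and function names are hypothetical and chosen purely for illustration (the addresses sit in the 239.0.0.0/8 administratively scoped range); a real product would have its own zone configuration scheme.

```python
import ipaddress
import socket

# Hypothetical mapping of playout zones to multicast group addresses.
ZONES = {
    "paging-warehouse": "239.10.0.1",
    "bgm-reception": "239.10.0.2",
}


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq value expected by IP_ADD_MEMBERSHIP."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    # Group address followed by the local interface address (0.0.0.0 = any).
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_zone(zone: str, port: int) -> socket.socket:
    """Open a UDP socket and join the multicast group for the given zone."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group causes the host to send an IGMP membership report,
    # which is what an IGMP snooping switch listens for.
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        membership_request(ZONES[zone]),
    )
    return sock
```

Moving a speaker to a different zone then means leaving one group and joining another, with no change to cabling.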

Quality of Service (QoS) describes traffic marking and resource control measures that prioritise time sensitive traffic (such as audio streams). IP audio applications often use the same QoS mechanisms as VoIP, such as DiffServ / DSCP marking, to expedite audio traffic and help keep transmission latency low.
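As an illustration of DSCP marking, the sketch below sets the DiffServ field on a UDP socket. The DSCP value occupies the upper six bits of the former IP ‘type of service’ byte, so it is shifted left by two bits before being applied. Expedited Forwarding (DSCP 46) is used here as an assumed choice; the correct value depends on the QoS policy of the network, and the markings only help if switches and routers are configured to honour them. Some operating systems ignore or restrict this socket option.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice and audio


def dscp_to_tos(dscp: int) -> int:
    """Convert a 6-bit DSCP value to the 8-bit value used with IP_TOS."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2  # the lower two bits are reserved for ECN


def open_marked_udp_socket(dscp: int = DSCP_EF) -> socket.socket:
    """Create a UDP socket whose outgoing packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```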

IP Audio, Layer 4

Transmission Control Protocol (TCP) is a connection-oriented protocol that ensures reliable delivery, eg. by retransmitting lost packets and re-ordering out-of-sequence packets. However, this comes at a cost: time, which adds to overall transmission and audio latency. For this reason, User Datagram Protocol (UDP) is the preferred transport layer protocol for IP audio applications. UDP is connectionless, simple and, importantly, quicker than TCP.
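The connectionless nature of UDP can be seen in a few lines of code. The sketch below sends a single datagram to a receiver on the local loopback address; there is no handshake and no acknowledgement, which is why UDP adds so little delay. The function name is illustrative, and a real audio sender would transmit a continuous stream of packets rather than one.

```python
import socket


def udp_loopback_round_trip(payload: bytes, timeout_s: float = 1.0) -> bytes:
    """Send one datagram to a local receiver and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # Port 0 asks the operating system for any free port.
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(timeout_s)
        # No connection setup: the sender simply addresses the datagram.
        sender.sendto(payload, receiver.getsockname())
        data, _source = receiver.recvfrom(2048)
        return data


if __name__ == "__main__":
    print(udp_loopback_round_trip(b"audio frame"))
```

Because nothing is retransmitted, a datagram lost on a real network simply never arrives, and the receiving application has to decide how to cover the gap.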

IP Audio, Layer 5

Real-time Transport Protocol (RTP) is an established network payload protocol used by VoIP and AoIP applications. RTP typically runs over UDP to limit overhead and expedite IP audio traffic (see layer 4).

To establish, maintain and monitor IP audio streams, RTP relies on companion signalling protocols. Some IP audio applications use bespoke signalling protocols; many use Session Initiation Protocol (SIP). SIP establishes the session and RTP carries the audio stream. RTCP (RTP Control Protocol) is used alongside RTP to monitor transmission, reporting reception quality statistics and aiding multi-stream synchronisation.
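To show what RTP adds on top of UDP, here is a minimal sketch that packs and unpacks the 12-byte fixed RTP header defined in RFC 3550. The sequence number lets a receiver detect lost or re-ordered packets, and the timestamp tells it when each block of audio should be played. The class and function names are our own, payload type 96 is simply an example from the dynamic range, and header extensions and padding are not handled.

```python
import struct
from typing import NamedTuple, Tuple

RTP_VERSION = 2
# Two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC (big-endian).
FIXED_HEADER = struct.Struct("!BBHII")


class RtpHeader(NamedTuple):
    payload_type: int   # identifies the codec, eg. 96 for a dynamic type
    sequence: int       # increments by one per packet
    timestamp: int      # sampling instant of the first audio sample
    ssrc: int           # identifies the stream source
    marker: bool = False


def pack_rtp(header: RtpHeader, payload: bytes) -> bytes:
    """Prefix an audio payload with a fixed RTP header."""
    byte0 = RTP_VERSION << 6  # no padding, no extension, no CSRC entries
    byte1 = (int(header.marker) << 7) | (header.payload_type & 0x7F)
    return FIXED_HEADER.pack(
        byte0,
        byte1,
        header.sequence & 0xFFFF,
        header.timestamp & 0xFFFFFFFF,
        header.ssrc & 0xFFFFFFFF,
    ) + payload


def unpack_rtp(packet: bytes) -> Tuple[RtpHeader, bytes]:
    """Split a packet into its fixed RTP header and audio payload."""
    if len(packet) < FIXED_HEADER.size:
        raise ValueError("packet is shorter than the RTP fixed header")
    byte0, byte1, sequence, timestamp, ssrc = FIXED_HEADER.unpack_from(packet)
    if byte0 >> 6 != RTP_VERSION:
        raise ValueError("not an RTP version 2 packet")
    csrc_count = byte0 & 0x0F  # skip any contributing-source identifiers
    payload_start = FIXED_HEADER.size + 4 * csrc_count
    header = RtpHeader(byte1 & 0x7F, sequence, timestamp, ssrc, bool(byte1 >> 7))
    return header, packet[payload_start:]
```

The bytes returned by pack_rtp would be handed to a UDP socket for transmission; the receiver calls unpack_rtp on each datagram it receives.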

IP Audio, Layer 6

For many Audio over IP (AoIP) applications, the presentation layer assists RTP/RTCP and provides the encoding / decoding (codec) function for audio streams. IP audio applications use both bespoke codecs and established generic codecs. The choice of codec determines the audio compression, and thus the bandwidth (traffic resource impact) and overall quality of the audio stream. For example, a lossless audio codec such as FLAC (Free Lossless Audio Codec) preserves full audio quality. A lossy audio codec such as MP3 incurs a lower traffic overhead, improving audio streaming over slower networks, but will often result in lower audio fidelity. How much this matters is subjective and depends on the installation setting (for example, streaming MP3 music within a relatively noisy workplace is quite a different prospect from a quiet studio or theatre).
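The bandwidth impact of codec choice can be estimated with some simple arithmetic. The sketch below compares uncompressed PCM audio with a 128 kbit/s MP3 stream, and adds the per-packet cost of IPv4 (20 bytes), UDP (8 bytes) and RTP (12 bytes) headers. The sample rate, bit depth, MP3 bitrate and 20 ms packet interval are assumed example values; Ethernet framing is not included, and lossless codecs such as FLAC are not modelled because their bitrate depends on the audio content.

```python
IP_UDP_RTP_OVERHEAD_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP fixed headers


def pcm_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def stream_bitrate_bps(codec_bitrate_bps: float, packet_ms: float) -> float:
    """Codec bitrate plus header overhead for a given packet interval."""
    packets_per_second = 1000 / packet_ms
    overhead_bps = packets_per_second * IP_UDP_RTP_OVERHEAD_BYTES * 8
    return codec_bitrate_bps + overhead_bps


if __name__ == "__main__":
    pcm = pcm_bitrate_bps(sample_rate_hz=48_000, bit_depth=16, channels=2)
    mp3 = 128_000  # assumed example MP3 bitrate
    for name, rate in (("PCM 48 kHz/16-bit stereo", pcm), ("MP3 128k", mp3)):
        total = stream_bitrate_bps(rate, packet_ms=20)
        print(f"{name}: {total / 1000:.0f} kbit/s on the wire")
```

Multiplying the result by the number of simultaneous unicast streams gives a first estimate of the load on a network link.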

IP Audio, Layer 7

Layer 7 is where the audio application’s User Interface (UI) sits. Some audio applications use dedicated software (server and client) to provide a Graphical UI (GUI) through which users configure and control the wider audio design solution. A browser based GUI over HTTP is a popular option as it requires no additional software: standard browsers communicate with the web interfaces of IP audio devices, interacting with device firmware for configuration and wider UI control.
