AoIP (Audio over IP)


  • Supercharge your commercial audio
  • Flexible, Scalable, no dedicated cabling
  • Combine multiple IP Audio applications
  • Configure, Control & Manage via GUI
  • Integrate with traditional audio systems

What is AoIP (Audio over IP)?

AoIP or ‘Audio over IP’ is a digital coding and encapsulation process used for the transmission of audio signals via IP network infrastructures. ‘Network Audio’ and ‘IP audio’ are terms used interchangeably with AoIP in describing this process – offering functionality, capability and scalability often not possible with traditional analogue audio systems.

Integrated Audio Technologies:

Integrating Audio over IP with traditional audio technology offers a powerful proposition. Freed from the shackles of dedicated cabling, AoIP can be employed to underpin connectivity, routing and transmission of audio via versatile multi-service IP network infrastructures. IP audio encoders/decoders and gateway interface devices can enable AoIP integration with traditional audio systems, devices and speakers – helping to protect existing investment whilst providing improved control and functional capability.

Commercial IP Audio describes how IP Audio technology and IP audio applications are used specifically within the various Commercial Markets and Industry Sectors.

Why Audio over IP?

When compared to traditional audio technology, AoIP sound systems offer more application options, greater functional capability and more flexible installation topologies.

The audio communication path between IP Audio entities (eg. IP microphone and IP speaker) is not fixed as it is with traditional audio cabling. Removing this restriction simplifies IP Audio system design and installation, and allows for improved scalability and deployment flexibility.

Unrestricted by static cabling constraints, routing of Audio over IP (AoIP) offers scalability and flexibility not available using fixed, traditional commercial 100 V line audio. IP audio and AoIP routing is configured via device firmware or system management software.

Example: Audio from source entities, such as an IP microphone or IP audio player, is routed flexibly without cable concern or restriction to play-out entities / speakers.

IP audio 'Zones' are created and managed through the device firmware GUI and system management software. This capability provides very flexible configuration, control and adjustment of audio zones without having to alter any wiring (as would be required with traditional audio).


Example: Structural or functional changes to a workplace (eg. an existing office work space is converted to a production area). Speakers in this workplace are simply re-configured with new audio zone details - no alteration to cabling is required.
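To illustrate the idea, the short Python sketch below models zone membership as plain configuration data. It is not taken from any particular product: the speaker names, zone labels and the `speakers_in` helper are all hypothetical, and real systems would store this in device firmware or management software.

```python
# Hypothetical device names and zone labels, for illustration only.
zones_by_speaker = {
    "spk-office-01": {"office", "all-call"},
    "spk-office-02": {"office", "all-call"},
}


def speakers_in(zone):
    """Return every speaker whose zone set contains the requested zone."""
    return sorted(s for s, zones in zones_by_speaker.items() if zone in zones)


# Part of the office becomes a production area: update the zone, not the wiring.
zones_by_speaker["spk-office-02"] = {"production", "all-call"}

print(speakers_in("office"))      # ['spk-office-01']
print(speakers_in("production"))  # ['spk-office-02']
print(speakers_in("all-call"))    # ['spk-office-01', 'spk-office-02']
```

The re-zoning step is a single assignment, which mirrors the point made above: the change is logical rather than physical.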

Audio over IP (AoIP) configuration and system management is offered via two typical GUI options:

1. Device firmware: For smaller, less complex system designs.

2. System management software: Generally used for larger more complex designs.

Both options offer significantly improved system administration and capability over traditional audio, which typically has little or no configuration / system management available.

IP audio provides functionality and capability hard to achieve or simply not possible with traditional audio. 


With growing customer expectation and demand, IP audio and supporting communications & infrastructure technologies have enabled new commercial audio capability not previously available. 

  • Web & Cloud based, 'curated' music services.
  • Hosted and premise streaming audio services
  • Audio 'QR' advertisement, notices, infotainment 
  • Embedded IP audio capability within security and safety systems (eg. CCTV talkback).

How do Audio over IP sound systems work?

Most commercial audio system designs typically employ four basic entities:

  • Audio Input Source (eg. microphone, music player)
  • Audio Processing (firmware & software) 
  • Audio Routing & Distribution (transmission technology and cabling)
  • Audio Output (Play-out devices, speakers)

With traditional audio systems, audio entities are connected ‘chain like’ by a dedicated fixed cabling infrastructure. This arrangement has stood the test of time and works well for basic commercial audio applications. However, the fixed cabling creates a rigid topology which limits capability and function for more enhanced or integrated audio system designs.

Audio over IP system entities are connected to the network via Ethernet cables, which in turn connect to a network switch. From that point, the audio path is no longer a physical connection but a logical one from end to end (audio streamed over IP). IP audio therefore offers improved scalability and flexibility for audio routing and integration with other network connected audio equipment and entities.

IP audio source entities such as an IP microphone contain an integrated IP ‘encoder’ to convert the electrical audio signal to IP audio. Once connected to the network, the resulting IP packets are routed and streamed to one, many or all audio play-out devices / IP speakers. IP speakers contain an integrated IP ‘decoder’ and mini-amplifier to convert IP audio back to electrical audio and drive the speaker for audio play-out.
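The encode, packetise and decode steps can be sketched in a few lines of Python. This is a simplified illustration only, assuming uncompressed 16-bit PCM, a 48 kHz sample rate and 5 ms of audio per packet; real encoders may use different codecs and packet times, and the function names here are hypothetical.

```python
import struct

SAMPLE_RATE_HZ = 48_000   # assumed sample rate
PACKET_MS = 5             # assumed amount of audio carried per packet


def encode_pcm16(samples):
    """Convert samples in the range -1.0..1.0 to 16-bit big-endian PCM bytes."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    return struct.pack(f"!{len(clipped)}h", *(int(s * 32767) for s in clipped))


def packetise(pcm, samples_per_packet):
    """Split a PCM byte stream into fixed-size payloads (2 bytes per sample)."""
    size = samples_per_packet * 2
    return [pcm[i:i + size] for i in range(0, len(pcm), size)]


def decode_pcm16(payload):
    """Convert one payload back to samples, as a decoder in a speaker might."""
    count = len(payload) // 2
    return [v / 32767 for v in struct.unpack(f"!{count}h", payload)]


samples_per_packet = SAMPLE_RATE_HZ * PACKET_MS // 1000   # 240 samples
one_second = [0.5] * SAMPLE_RATE_HZ                       # constant test signal
packets = packetise(encode_pcm16(one_second), samples_per_packet)
print(len(packets), len(packets[0]))                      # 200 480
```

With these assumptions each payload is 480 bytes, which fits comfortably inside a standard 1500-byte Ethernet frame payload.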

Audio over IP - 'Entities'

The choice of IP audio devices and product entities is primarily based on the required audio applications. For example, the devices required for a simple IP Paging, IP PA or IP Tannoy application are likely to include IP Microphone and IP Speaker entities. Depending on the scale and complexity of the design, device firmware or dedicated software may be required to provide enhanced configuration, control and management.


IP Audio 'Entities' and Products

IP audio device products and their supporting technologies and standards are rapidly evolving. Some devices are also designed to offer multiple entity capabilities (eg. a single device that offers both IP encoding and IP decoding). Creating a design solution that integrates traditional audio and IP audio can add further considerations, and sometimes complexity, to product and device selection.

However, for many commercial IP audio projects, product devices can largely be defined and grouped into category ‘entities’ as shown:

IP Speakers plug into the network for service (just like any other network device). Most IP speakers also receive power from the network switch (Power over Ethernet, PoE), in which case no additional cables or connections are required. IP speakers are essentially an IP audio decoder with a 'built-in' amplifier to provide output from the speaker itself. Different manufacturer IP speakers provide different capability - for example, some have a built-in microphone to enable 2-way audio talkback.

IP Microphones connect to the network via a standard network cable. Some IP Microphones require mains power, others use Power over Ethernet (PoE). IP Microphones are operated just like regular microphones. Single zone and multi-zone IP Microphones are available, in both physical button and screen-based forms. We also offer an IP Console solution which is GUI software based, using a PC microphone / headset for audio.

IP audio encoders are used to encode traditional electrical audio signals for interface and distribution via IP network infrastructures. For example, a traditional CD player could be connected to an IP audio encoder, which could then be used as an IP background music (IP BGM) source for streaming and play-out through single IP speakers or zone groups of IP speakers.

IP audio decoders are used to decode IP audio for interface and connection to traditional audio speakers and system equipment. For many new audio installation projects, or to 'IP enable' existing traditional audio systems, we will often connect an IP audio decoder to an input of a traditional audio mixer/amplifier. This can be transformative in both system design / topology and capability, enabling play-out of streaming IP audio from an IP microphone or other IP audio source (internet radio, music streaming service or an IP audio encoder). For large project sites running traditional audio, IP audio encoders / decoders could be used to distribute audio between locations and buildings, improving and simplifying audio routing capability whilst minimising local cabling investment.

IP audio products and devices need to be configured to enable audio function and capability. Firmware is the running configuration that resides on the IP audio device (eg. IP speaker). Software is generally separate from the IP audio devices and works with the device firmware to provide enhanced function or capability. For example, some IP audio suppliers offer a software/server combination for control and management of larger and more complex IP audio system installations.

IP audio can originate from a number of source entities. This could be a traditional CD player or radio tuner connected to an IP audio encoder, an internet radio or an internet music content provider. An IP audio player device can be used to combine many different play-out options, including audio messages (eg. security, safety announcements). Importantly, IP audio players can also run to time schedules, eg. initiating play-out of music, radio and messages at set times (BGM with advertisements or Covid announcements would be a good example).
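As a rough sketch of schedule-driven play-out, the Python below picks what should be playing at a given time of day. The schedule entries, content labels and the `content_at` helper are hypothetical; commercial IP audio players provide their own scheduling interfaces.

```python
from datetime import time
from typing import Optional

# Hypothetical schedule of (start, end, content); messages are listed first
# so that they take priority over background music in the same time window.
SCHEDULE = [
    (time(12, 0), time(12, 1), "message: safety announcement"),
    (time(8, 0), time(18, 0), "stream: background music"),
]


def content_at(now: time) -> Optional[str]:
    """Return the first schedule entry whose window contains `now`."""
    for start, end, content in SCHEDULE:
        if start <= now < end:
            return content
    return None


print(content_at(time(9, 30)))      # stream: background music
print(content_at(time(12, 0, 30)))  # message: safety announcement
print(content_at(time(20, 0)))      # None
```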

Audio over IP - Applications

Audio over IP applications are widely employed within different commercial and industry business sectors.

AoIP applications are not technically specific or exclusive and can be combined collectively with other IP audio applications. Mixed-technology audio applications (IP audio & Traditional audio) can also be integrated to create highly capable and cost effective audio design solutions.

AoIP - Protocols & Standards

This section provides outline technical information about Audio over IP (AoIP) protocols and standards and how IP audio and wider IP sound systems communicate and operate within ‘IP Networks’ (IP protocol suite, aka TCP/IP).


Like any IP network connected device, IP audio devices from all manufacturers adhere to the principal constituent protocols and standards, which are maintained and guided through the IETF (Internet Engineering Task Force).

Here, we discuss IP audio, identifying notable network protocols and standards and explore how they are used by developers and industry manufacturers.


IP Audio & Networking Frameworks (OSI & TCP/IP)

In the context of network inter-communication, there are a number of very well discussed networking frameworks that are used to enable different computer systems to connect and communicate with each other (including IP audio systems and IP audio device entities).

The OSI 7-layer model and the TCP/IP 4-layer model are two widely used and similarly layered inter-networking architectures. Many other similarities exist between the two models. However, the OSI model is a generic model focused on the functionality of each layer, whereas the TCP/IP model is more protocol oriented.

IP Audio, Layers 1 & 2

Often referred to as the ‘physical’ and ‘frame’ layers, L1 & L2 (network interface layer) hold little relationship to specific IP audio protocols or standards. Precision Time Protocol (PTP) is a notable exception – this protocol is used by many IP audio applications to provide clock synchronisation, enabling accurate timing for audio capture and playout.

Otherwise, generic networking protocols and standards apply. Most commercial IP audio entities connect to the network via an industry standard RJ45 network interface controller / ‘NIC’ and structured network cabling infrastructures (eg. Cat5e/Cat6). Some IP audio entities are wireless, typically connecting to the network via WiFi. All network connected devices, including IP audio devices, have a unique Media Access Control (MAC) address, which the serving network switch records in its address table. Virtual LANs (VLANs) are commonly configured to give IP audio applications a dedicated network broadcast domain – also simplifying administration and improving access control / security.

IP Audio, Layer 3

Layer 3, the network layer, handles IP addressing and routing and is often referred to as the IP ‘packet’ or ‘logical’ layer. IP audio applications use and depend on this layer to manage and move audio across networks.

Unlike MAC addresses (discussed above for layer 2 / network interface layer), IP addresses can refer to a single network device, a group or an entire broadcast domain (unicast, multicast, broadcast). IP audio application protocols utilise this addressing flexibility to control and route audio as well as to create logical groupings for playout assignment (eg. audio paging zones or lower priority background music, BGM zones). This capability has clear advantages over traditional audio and fixed entity assignment via dedicated cabling infrastructures. IP audio traffic routed via multicast (say, from a music source to a BGM zone) is not without its challenges. To limit the resource overhead of multicast flooding, receiving devices use IGMP to signal multicast group membership, and IGMP snooping on network switches restricts each stream to the ports that have joined the group.

Quality of Service (QoS) is a term used to describe traffic marking and resource control measures that prioritise time sensitive data traffic (such as audio streams). IP audio applications often utilise the same QoS mechanisms as VoIP, such as DSCP / DiffServ marking, to help expedite audio traffic and keep audio transmission latency low.
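A receiving device typically joins a multicast group through the standard socket API, which prompts the operating system to send the IGMP membership report that switches can observe. The Python sketch below shows one way this might look. The zone-to-address scheme (`239.10.0.x`), the port number and the function names are assumptions made for illustration, not part of any IP audio standard.

```python
import ipaddress
import socket
import struct

# Assumption: zones map into the administratively scoped range 239.0.0.0/8.
ZONE_BASE = ipaddress.IPv4Address("239.10.0.0")


def zone_group(zone_id: int) -> str:
    """Derive a multicast group address for a zone (hypothetical scheme)."""
    if not 1 <= zone_id <= 254:
        raise ValueError("zone_id must be between 1 and 254")
    return str(ZONE_BASE + zone_id)


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure passed to IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_zone_receiver(zone_id: int, port: int = 5004) -> socket.socket:
    """Open a UDP socket and join the zone's multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(zone_group(zone_id)))
    return sock


print(zone_group(3))  # 239.10.0.3
```

Calling `open_zone_receiver(3)` on a networked host would subscribe that host to the zone 3 stream; moving a speaker to another zone then amounts to leaving one group and joining another.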
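As a small worked example of DSCP marking, the sketch below sets the DiffServ code point on a UDP socket in Python. It assumes the Expedited Forwarding class (DSCP 46), which is commonly used for voice and audio. Whether the mark is honoured depends on the operating system and on how the network switches are configured, and the function names are illustrative only.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice and audio


def dscp_to_tos(dscp: int) -> int:
    """DSCP occupies the upper six bits of the IP TOS / Traffic Class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def open_marked_socket(dscp: int = DSCP_EF) -> socket.socket:
    """Create a UDP socket whose outgoing packets request the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(dscp_to_tos(DSCP_EF))  # 184 (0xB8)
```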

IP Audio, Layer 4

Transmission Control Protocol (TCP) is a connection-based protocol that ensures reliable network traffic transmission, eg. by retransmitting lost packets and re-ordering packets that arrive out of sequence. However, this comes at a cost: time, which adds to overall transmission and audio latency. For this reason, User Datagram Protocol (UDP) is the preferred transport layer protocol for IP audio applications. UDP is connectionless, simple and, importantly, quicker than TCP.
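The connectionless nature of UDP can be seen in the minimal Python sketch below, which sends a single datagram to a receiver on the local machine. There is no handshake before the send and no acknowledgement afterwards. The function name and payload are hypothetical, and the loopback address is used only so the example is self-contained.

```python
import socket


def loopback_round_trip(payload: bytes, timeout_s: float = 1.0) -> bytes:
    """Send one datagram to a local receiver and return what arrives."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(timeout_s)    # avoid waiting forever if nothing arrives
        # No connection set-up: the datagram is sent immediately, with no
        # delivery guarantee and no retransmission if it is lost.
        sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()


print(loopback_round_trip(b"audio-frame-0001"))
```

On a real network a lost datagram would simply never arrive, which is why audio applications tolerate or conceal small gaps rather than waiting for retransmission.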

IP Audio, Layer 5

Real-time Transport Protocol (RTP) is an established network payload protocol used by VoIP and AoIP applications. RTP typically runs over UDP to limit resource overhead and expedite IP audio traffic (see layer 4).

To establish, maintain and monitor IP audio streams, RTP requires help from companion signalling protocols. Some IP audio applications use bespoke signalling protocols; many use Session Initiation Protocol (SIP). SIP establishes the connection and RTP carries the audio stream. RTP Control Protocol (RTCP) is used in conjunction with RTP to monitor audio transmission, reporting on quality of service (QoS) and aiding multi-stream synchronisation.
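To make the RTP packet layout more concrete, the Python sketch below packs and parses the 12-byte fixed RTP header defined in RFC 3550 (version, marker, payload type, sequence number, timestamp and SSRC). It deliberately ignores padding, header extensions and CSRC lists. The payload type 96 (from the dynamic range) and the SSRC value are arbitrary choices for illustration.

```python
import struct
from typing import NamedTuple

RTP_VERSION = 2
HEADER_FORMAT = "!BBHII"   # flags, marker + payload type, sequence, timestamp, SSRC


class RtpHeader(NamedTuple):
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    marker: bool = False


def pack_header(h: RtpHeader) -> bytes:
    """Serialise the 12-byte fixed header (no padding, extension or CSRCs)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(h.marker) << 7) | (h.payload_type & 0x7F)
    return struct.pack(HEADER_FORMAT, byte0, byte1, h.sequence & 0xFFFF,
                       h.timestamp & 0xFFFFFFFF, h.ssrc & 0xFFFFFFFF)


def unpack_header(data: bytes) -> RtpHeader:
    """Parse the fixed header from the start of a received packet."""
    byte0, byte1, seq, ts, ssrc = struct.unpack(HEADER_FORMAT, data[:12])
    if byte0 >> 6 != RTP_VERSION:
        raise ValueError("not an RTP version 2 packet")
    return RtpHeader(byte1 & 0x7F, seq, ts, ssrc, bool(byte1 >> 7))


# Payload type 96 and this SSRC are example values, not taken from a real device.
header = RtpHeader(payload_type=96, sequence=1, timestamp=240, ssrc=0x1234ABCD)
packet = pack_header(header) + b"\x00" * 480   # header followed by audio payload
print(len(pack_header(header)), unpack_header(packet) == header)  # 12 True
```

The sequence number lets a receiver detect loss and re-ordering, and the timestamp drives play-out timing, which is how RTP compensates for running over UDP.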

IP Audio, Layer 6

For many Audio over IP (AoIP) applications, the presentation layer is responsible for assisting with RTCP/RTP function as well as providing encoding / decoding (codec) function for audio streams. IP audio applications use both bespoke codecs and established generic codecs. The choice of codec determines audio compression and thus the bandwidth (traffic resource impact) and overall quality of the audio stream. For example, a lossless audio codec such as FLAC (Free Lossless Audio Codec) will provide high audio quality. A lossy audio codec such as MP3 incurs a lower overall traffic overhead, improving audio streaming over slower networks, but will often result in lower audio fidelity. This is of course subjective and also depends on the commercial audio installation setting (for example, streaming MP3 music within a relatively noisy workplace is quite a different prospect to a quiet studio or theatre).
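The bandwidth trade-off can be estimated with simple arithmetic, sketched below in Python. The figures are assumptions chosen for illustration: 48 kHz, 16-bit stereo PCM as the uncompressed reference, an MP3 stream encoded at 128 kbit/s, and 5 ms of audio per packet. Header overhead counts only the IPv4, UDP and RTP fixed headers, not Ethernet framing.

```python
HEADER_BYTES = 20 + 8 + 12   # IPv4 + UDP + RTP fixed headers (no options)


def pcm_bitrate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def header_overhead(packet_ms: float) -> int:
    """Extra bits per second added by IP/UDP/RTP headers at a given packet time."""
    packets_per_second = 1000 / packet_ms
    return round(HEADER_BYTES * 8 * packets_per_second)


pcm = pcm_bitrate(48_000, 16, 2)   # 48 kHz, 16-bit stereo
mp3 = 128_000                      # assumed MP3 encoder setting
print(pcm, pcm / mp3)              # 1536000 12.0
print(header_overhead(5))          # 64000
```

Under these assumptions the uncompressed stream needs roughly twelve times the bandwidth of the MP3 stream, and short packet times add noticeable header overhead on top of either.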

IP Audio, Layer 7

Layer 7 is where the audio application's User Interface (UI) resides. Some audio applications use dedicated software (server and client) to provide a Graphical UI (GUI) that enables users to configure and control the wider audio design solution. A browser-based GUI via HTTP is a popular user interface as it requires no additional software. Standard browsers communicate with the web interfaces of IP audio devices, interacting with device firmware for configuration and wider UI control.
