Golborne Vintage Radio

Full Version: Building a Standards Converter
(15-03-2017, 08:19 PM) ppppenguin wrote: Great thing about programmable logic is that you can play with the design so easily. No soldering :)

Yes and if you get it wrong there is no magic smoke to escape. The great thing is that you can keep getting it wrong without any consequences until you get it right.

My regimen for learning is to write as small an amount as possible and try to compile it. If it compiles successfully, great, but more often than not the compile will fail with many errors reported. I find the cause of the errors, correct them and recompile, then keep adding to the code and repeating. The more time spent writing and correcting, the fewer the errors.

Frank
Writing a small amount and getting it to compile is a good way to do things. As a beginner in VHDL you will make lots of mistakes that the compiler picks up. Usually with unhelpful error messages. Even as an experienced designer I can sometimes be scratching around wondering where I did something wrong.
I got some time over the weekend to have a play and decided to install the FPGA in place of the CPLD. The headers on the two boards are different so a bit of rewiring was needed. The first photo below shows the CPLD on the left and the FPGA on the right. The second photo shows the FPGA installed.

I reconfigured the files to suit the FPGA and programmed it, but there was no output. A few checks showed that there was plenty of life on the write side but no activity on the read side. The read clock was taken from the driven side of the 9 MHz microcontroller crystal and is of low amplitude, not much over a volt. I got away with using it with the CPLD but it wasn’t good enough to drive the FPGA.

I thought that this would be a good opportunity to try the DTO that Jeffrey had explained.
There is an on-board 50 MHz clock which I used. Then I basically copied the relevant parts from Jeffrey's file to give a 9 MHz output frequency and it worked. I used it to clock the microcontroller as well.
The output was a bit jittery due to the low (50 MHz) clock but it gave a watchable picture.
Today I found the PLL function on the FPGA and configured it to multiply by 3, giving a 150 MHz clock. I rewrote the file to suit and now the 9 MHz clock is much more stable. A DTO is extremely useful. Thanks again Jeffrey for explaining it.
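
For anyone following along, here is a minimal sketch of a DTO of the kind described above. It is not Jeffrey's file: the entity name, the 24 bit accumulator width and the port names are all assumptions made for illustration. The increment is round(9/150 x 2^24) = 1006633; for the 50 MHz clock it would be round(9/50 x 2^24) = 3019899.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical DTO sketch: phase accumulator clocked at 150 MHz, MSB is the ~9 MHz output
entity dto_9mhz is
    generic (
        ACC_BITS  : positive := 24;         -- Accumulator width (assumed)
        INCREMENT : natural  := 1006633     -- round(9/150 * 2**24)
    );
    port (
        CLK_150 : in  std_logic;            -- 150 MHz clock from the PLL
        CLK_9   : out std_logic             -- Approx 9 MHz output
    );
end entity dto_9mhz;

architecture rtl of dto_9mhz is
    signal ACC : unsigned(ACC_BITS - 1 downto 0) := (others => '0');
begin
    process (CLK_150) begin
        if rising_edge(CLK_150) then
            -- Add a fixed increment every clock. The sum wraps modulo 2**ACC_BITS
            ACC <= ACC + to_unsigned(INCREMENT, ACC_BITS);
        end if;
    end process;

    -- MSB toggles at an average of CLK_150 * INCREMENT / 2**ACC_BITS
    CLK_9 <= ACC(ACC_BITS - 1);
end architecture rtl;
```

The output edges can only move on edges of the accumulator clock, so they are placed to the nearest 20 ns with a 50 MHz clock and the nearest 6.7 ns or so at 150 MHz. That is why the faster clock gives the steadier picture.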

Frank  
Frank, glad you got a DTO working well. They really are a very useful part of digital design.
I've never used Altera FPGAs so am unfamiliar with their capabilities. I assume they must be comparable to what Xilinx parts can do in any given generation and cost but with many differences of detail.

You say you multiplied the 50MHz clock up to 150MHz using a PLL within the FPGA. If the PLL can do it there's probably no harm going up to 200MHz or even higher. I don't know how fast the logic will run but you're only talking about a very small amount running at that frequency. This will vary with the FPGA make and family, but you should apply some kind of timing specification when running fast logic, to ensure that it will actually run at that speed. The detail will differ between Altera and Xilinx tools.

My own experience in Xilinx Spartan 3A devices is that 54MHz might as well be DC - it would be hard to design something that failed at 54MHz unless you were cascading loads of combinatorial logic without registers. At 148.5MHz I needed to take great care with pipelining to meet timespecs. The other factor is how full the device is. It's easy to get high speed in a lightly utilised device. Experience is really the only way to get a feel for this.
Hi Jeffrey
You are quite correct. Yesterday I had another play with the PLL to see what it could do, and pushed the output frequency up to 450 MHz where it worked quite happily. The DTO was adjusted accordingly to give the 9 MHz output. I have just looked up the speed grade of the device I have and it is 402 MHz, oops!
I also found that the PLL could produce a 9 MHz clock itself, but then I wouldn’t have learnt anything about DTOs.
It would seem that the sensible thing to do is to clock the PLL from the 13.5 MHz write clock and get the PLL to produce 9 MHz from that, as at some stage I will probably try and use the FPGA's internal memory and to do that the 625 and 405 clocks would need to be synced.

Before I can do that I will need to move the sync generation and read control from the microcontroller to the FPGA. This is because the video decoder, where the 13.5 MHz is sourced from, starts up with a 27 MHz clock and it is only after the microcontroller programs it that it produces 13.5 MHz. The problem with this is that at start up the microcontroller would be clocked at 18 MHz (27 MHz x 2/3). In its current set up it can't cope with that high a frequency and therefore would not be able to program the video decoder.

Generating sync in the FPGA is a totally different ball game to doing it with a microcontroller; it's something I will have to spend some time thinking about to get my head around. I have browsed your video_output file, it is very well documented and has given me some ideas. As I will try to do a 5:4 output as well as 4:3, I will need to keep that in mind while doing the syncs. I will also need to figure out how well the 5:4 clock will sync up with the 625 if using the internal memory.

While using the PLL at 450 MHz I decided to see whether it would be possible to get modulated RF out instead of video. I pulsed the outputs to the DAC R2R ladder at 51.75 MHz (channel 2) by ANDing them with a 51.75 MHz clock. The results are shown below. This is just over the air with no lead used. The last photo, with the best results, was obtained by leaving a finger on the appropriate place on the R2R ladder. It might be worth pursuing at some stage.
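
A rough sketch of the gating described above, for illustration only. The entity name, port names and the 8 bit width of the ladder are assumptions, and CARRIER is taken to be a 51.75 MHz square wave made elsewhere, for example by a DTO running from the 450 MHz clock.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch: gate each DAC bit with the carrier
entity rf_gate is
    port (
        CARRIER : in  std_logic;                      -- 51.75 MHz square wave (assumed source)
        VIDEO   : in  std_logic_vector(7 downto 0);   -- Video sample, width assumed
        DAC     : out std_logic_vector(7 downto 0)    -- To the R2R ladder
    );
end entity rf_gate;

architecture rtl of rf_gate is
begin
    -- Ladder sees the video value while CARRIER is high and zero while it is low,
    -- so the carrier amplitude follows the video level
    gen_gate : for i in VIDEO'range generate
        DAC(i) <= VIDEO(i) and CARRIER;
    end generate;
end architecture rtl;
```

Because the carrier is a square wave the output will also contain its odd harmonics, so some filtering would be wanted before connecting it to anything by cable.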

Frank  
My experience of decoder chips is that they give 27MHz out. Not yet seen one that gives 13.5MHz but happy to be proven wrong. You could program the decoder from the FPGA. It's not too hard to write a "bit banger" that will do the job.

If there's a framestore in the system the input and output clocks can be completely independent of each other. If you are using linestores then they must be related.

Generating TV syncs in an FPGA is easy enough. Divide down from clock to half lines and a further divide by 2 for full lines. Once every half line you enable the field counter. The odd/even field falls out naturally from all this. The pulses are generated with JK flipflops. It's all there in the video_output file. It may look a bit more complicated because I have used some named constants. For me that made it simpler, especially if I was tweaking the timings. Meant all pulses moved in step with the offset. You can of course play with variable width pulses if you have some way of controlling the FPGA from an external microcontroller or PC. You'll see a load of that, effectively memory mapped "write only" registers, going on in the section that starts:

process (CKC) begin
if rising_edge(CKC) then

CKC is the control system clock. Can't remember what it was, probably same as one of the video clocks. In another VHDL module there's the interface between an x51 class CPU and my internal control buses. I've used this approach with minor variations for as long as I can remember.
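
The sync recipe above (divide down to half lines, divide by 2 again for full lines, enable the field counter once per half line, make pulses with JK style flipflops) can be sketched roughly as below. This is not the video_output file. The names are made up for illustration and the generic values assume 625 lines at 13.5 MHz. Only a plain line sync pulse and a field flag are produced; equalising and broad pulses are left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sync generator sketch. Default generics assume 625 lines, 13.5 MHz clock
entity sync_sketch is
    generic (
        CLOCKS_PER_HALF_LINE : positive := 432;    -- 864 clocks per 64 us line
        HALF_LINES_PER_FRAME : positive := 1250;   -- 625 lines
        H_SYNC_WIDTH         : positive := 63      -- About 4.7 us
    );
    port (
        CK     : in  std_logic;
        H_SYNC : out std_logic;    -- Active high line sync
        FIELD  : out std_logic     -- '0' in first field, '1' in second
    );
end entity sync_sketch;

architecture rtl of sync_sketch is
    signal HALF_COUNT  : integer range 0 to CLOCKS_PER_HALF_LINE - 1 := 0;
    signal SECOND_HALF : std_logic := '0';
    signal V_COUNT     : integer range 0 to HALF_LINES_PER_FRAME - 1 := 0;
    signal H_SYNC_FF   : std_logic := '0';
begin
    process (CK) begin
        if rising_edge(CK) then

            -- Divide clock down to half lines
            if HALF_COUNT = CLOCKS_PER_HALF_LINE - 1 then
                HALF_COUNT  <= 0;
                SECOND_HALF <= not SECOND_HALF;    -- Further divide by 2 for full lines
                -- Field counter is enabled once every half line
                if V_COUNT = HALF_LINES_PER_FRAME - 1 then V_COUNT <= 0;
                    else                                   V_COUNT <= V_COUNT + 1;
                end if;
            else
                HALF_COUNT <= HALF_COUNT + 1;
            end if;

            -- JK flipflop: set at start of line, reset H_SYNC_WIDTH clocks later
            if    SECOND_HALF = '0' and HALF_COUNT = 0            then H_SYNC_FF <= '1';
            elsif SECOND_HALF = '0' and HALF_COUNT = H_SYNC_WIDTH then H_SYNC_FF <= '0';
            end if;

        end if; -- CK
    end process;

    H_SYNC <= H_SYNC_FF;
    -- Second field starts half way through the frame, i.e. in the middle of a line
    FIELD  <= '1' when V_COUNT >= HALF_LINES_PER_FRAME / 2 else '0';
end architecture rtl;
```

Putting the timings in generics is the same idea as the named constants mentioned above: change one number and the pulse edges move together.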

I've occasionally wondered about doing a VHF modulator in an FPGA. Nice to see you've tried it and made it work.
Hi Jeffrey
It's a TVP5146 that I am using. The default mode is a 27 MHz clock with the Y and C output in series through one port, but it can be programmed to output Y and C in parallel through separate ports; the clock is then 13.5 MHz. I wanted separate Y out as I thought the likelihood of successfully separating Y and C at 27 MHz wasn’t great using a microcontroller and some logic gates, as they were the components I intended to use when I started. Now, with an FPGA, a smaller chip like the TVP5150 would be a better choice.

My goal is to eventually move all the functions performed by the microcontroller to the FPGA. This will be done bit by bit as I learn, so hopefully one day a “bit banger” will be on the cards.

I intend to use the frame stores for now at least but it is tempting to use the memory in the FPGA. So while the need for the input and output clocks to be synchronous isn’t immediate, it would be nice to have them that way, should the need arise.

When you put sync generation that way it sounds straightforward. I will have a think about it and I'm sure it will all eventually fall into place.

Frank  
Code:
----------------------------------------------------------------------------------
-- Video input
-- 13/2/13: Commenced
-- 13/1/14: Can select 1 of 4 frames in SDRAM, thus allowing TCC to be present along with live video
-- 25/2/14: SIMM2 input added
-- 18/3/14: Input presence detection added
----------------------------------------------------------------------------------

library IEEE;
use IEEE.std_logic_1164.ALL;
use IEEE.std_logic_ARITH.ALL;
use IEEE.std_logic_UNSIGNED.ALL;
library UNISIM;
use UNISIM.VComponents.all;
use USEFUL_FUNCTIONS.all;    -- Useful functions such as "to_std_logic" (boolean to std_logic)

----------------------------------------------------------------------------------------------------------------------
--                             START OF ENTITY        
----------------------------------------------------------------------------------------------------------------------    
entity video_input is Port (    
    TP : out std_logic_vector(10 downto 1);    -- Test points
    CKC : in std_logic; CBLOCK : in std_logic_vector(3 downto 0); CD, CA : in std_logic_vector(7 downto 0);    -- Controls
    oINPUT_PRESENT : out std_logic;
    
-- SDRAM interface
    CKRAM : in std_logic;
    W_SLOT : in std_logic;
    SLOT_INCREMENT : in std_logic;
    oSDRAM_WRITE_ENABLE : out std_logic;
    oSDRAM_WRITE_DATA : out std_logic_vector(31 downto 0);          
    oSDRAM_WRITE_ADDRESS : out std_logic_vector(21 downto 0);

-- SIMM slot 1
    pV1R, pV1X : in std_logic_vector(9 downto 0);
    pCK1R, pLOCK1 : in std_logic;    
    pCK1SIMM : out std_logic;    
-- SIMM slot 2
    pV2R, pV2X : in std_logic_vector(9 downto 0);
    pCK2R, pLOCK2 : in std_logic;    
    pCK2SIMM : out std_logic);    
end video_input;
--============================= END OF ENTITY =======================================================================    

----------------------------------------------------------------------------------------------------------------------
--                             START OF ARCHITECTURE        
----------------------------------------------------------------------------------------------------------------------        
architecture rtl of video_input is

--------------------------------- Internal signals -------------------------------------------------------------------
-- Initialised signals
signal INVERT_INPUT_CLOCK                 : boolean                      := false;    -- Invert for analogue input
signal SELECT_SIMM2                       : boolean                      := false;
signal LOCK_IS_MANUAL                     : boolean                      := false;
signal ALLOW_SDRAM_WRITES                 : boolean                      := true;
signal VCOUNT, SET_VCOUNT_START           : std_logic_vector(8 downto 0) := "111111010";
signal SDRAM_FRAME                        : std_logic_vector(1 downto 0) := "01";

signal CKR, CK1R, CK1Rx, CK2Rx, SELECTED_CLOCK : std_logic;
signal V1Ra, V1R, V2Ra, V2R, SELECTED_VIDEO, SELECTED_VIDEO_DELAYED : std_logic_vector(9 downto 0);
signal DETECT_FF, PRE_RUNIN, RUNIN, EAV, SAV : boolean;
signal SEPARATED_H, SEPARATED_V, SEPARATED_F : std_logic;
signal HCOUNT : std_logic_vector(10 downto 0);
signal SET_MANUAL_LOCK, VREF : std_logic;
signal DELAYED_SEPARATED_F : std_logic_vector(2 downto 0);

-- DPRAM 32x10. For pingpong buffer, inferred version
type DPRAM32X10 is array(31 downto 0) of std_logic_vector(9 downto 0);
signal PING_PONG_RAM_Y, PING_PONG_RAM_C : DPRAM32X10;

signal SDRAM_WRITE_DATA_Y, SDRAM_WRITE_DATA_C : std_logic_vector(9 downto 0);
signal AREQ : boolean;
signal DP_RAM_READ_ADDRESS : std_logic_vector(4 downto 0);
signal INTERMEDIATE_ADDRESS, SDRAM_WRITE_ADDRESS : std_logic_vector(21 downto 0);    -- 22 bit
signal SDRAM_WRITE_ENABLE_START  : std_logic;
signal PRE_REQUEST, REQUEST, GRANT, SDRAM_WRITE_ENABLE : std_logic;
signal DELAYED_GRANT : std_logic_vector(7 downto 0);
signal VERY_DELAYED_GRANT : std_logic_vector(15 downto 0);
signal DELAYED_SLOT_INCREMENT : std_logic;

signal INPUT_DETECT_COUNT : integer range 0 to 16#1FFFFF#;
signal INPUT_PRESENT, INPUT_FRAME_EDGE, FF0, FF1 : boolean;
--====================================================================================================================        
        
----------------------------------------------------------------------------------------------------------------------
--                             ************** BEGIN PROCESSES ************        
----------------------------------------------------------------------------------------------------------------------        
begin

-----------------------------------------------  Controls ------------------------------------------------------------
process (CKC) begin
    if rising_edge(CKC) then
    
        if (CA = X"50") and CBLOCK(0) = '1' then
            SELECT_SIMM2       <= CD(0) = '1';
            INVERT_INPUT_CLOCK <= CD(2) = '1';
            SET_MANUAL_LOCK    <= CD(4);
            LOCK_IS_MANUAL     <= CD(5) = '1';        
        end if;

-- ALLOW_SDRAM_WRITES = not FREEZE    
        if (CA = X"51") and CBLOCK(0) = '1' then ALLOW_SDRAM_WRITES <= CD(0) = '1'; end if;
-- Vertical alignment        
        if (CA = X"52") and CBLOCK(0) = '1' then SET_VCOUNT_START(8)          <= CD(0); end if;
        if (CA = X"53") and CBLOCK(0) = '1' then SET_VCOUNT_START(7 downto 0) <= CD;    end if;

        if (CA = X"54") and CBLOCK(0) = '1' then SDRAM_FRAME <= CD(1 downto 0); end if;

    end if; --CKC
end process;
--====================================================================================================================        

--------------------------------------- Input buffering, relatching --------------------------------------------------
-- IBUFG explicitly declared for pCK1R since it's on a global clock pin
IBUFG_CK1R : IBUFG port map (O => CK1R, I => pCK1R);    

CK1Rx <= not CK1R when INVERT_INPUT_CLOCK else CK1R;    -- Input buffering. IFD uses either inverted or non-inverted clock
process (CK1Rx) begin
    if rising_edge(CK1Rx) then V1Ra <= pV1R; end if;
end process;

process (CK1R) begin
    if rising_edge(CK1R) then V1R <= V1Ra; end if; -- Relatch with non-inverted clock
end process;

CK2Rx <= not pCK2R when INVERT_INPUT_CLOCK else pCK2R;    -- Input buffering. IFD uses either inverted or non-inverted clock
process (CK2Rx) begin
    if rising_edge(CK2Rx) then V2Ra <= pV2R; end if;
end process;

process (pCK2R) begin
    if rising_edge(pCK2R) then V2R <= V2Ra; end if; -- Relatch with non-inverted clock
end process;

-- LOCK not used at present
-- V1X, V2X not used at present

SELECTED_CLOCK <= not pCK2R when SELECT_SIMM2 else not CK1R; -- Select clock from SIMM1 or 2
BUFG_CKR : BUFG port map (O => CKR, I => SELECTED_CLOCK);    -- Global buffer for input clock

pCK1SIMM <= '0';
pCK2SIMM <= '0';
--====================================================================================================================        

------------------------------------ Input presence detect -----------------------------------------------------------
-- Absence of separated FF is adequate evidence of loss of either analogue or digital input
-- Differentiate both edges of SEPARATED_F, use to load 21 bit counter. If counter reaches 0 then input is absent
process (CKC) begin
    if rising_edge(CKC) then
    
        FF0 <= SEPARATED_F = '1';    -- To CKC domain
        FF1 <= FF0;
        INPUT_FRAME_EDGE <= FF0 xor FF1;
    
        if INPUT_FRAME_EDGE then INPUT_DETECT_COUNT <= 16#1FFFFF#;
            else                  INPUT_DETECT_COUNT <= (INPUT_DETECT_COUNT - 1) mod 16#200000#;    -- Brackets needed: mod binds tighter than "-"
        end if;
        
        if INPUT_FRAME_EDGE then             INPUT_PRESENT <= true;
            elsif INPUT_DETECT_COUNT = 0 then INPUT_PRESENT <=  false;
            else                              INPUT_PRESENT <= INPUT_PRESENT;
        end if;

    end if; --CKC
end process;
oINPUT_PRESENT <= to_std_logic(INPUT_PRESENT);
--====================================================================================================================        
        
------------------------------------ Video side of DPRAM -------------------------------------------------------------        
process (CKR) begin
    if rising_edge(CKR) then

-- Input video selector
        if SELECT_SIMM2 then SELECTED_VIDEO <= V2R; else SELECTED_VIDEO <= V1R; end if;

-- Sync separator
        DETECT_FF <=               SELECTED_VIDEO(9 downto 2) = X"FF";
        PRE_RUNIN <= DETECT_FF and SELECTED_VIDEO(9 downto 2) = X"00";
        RUNIN <= PRE_RUNIN     and SELECTED_VIDEO(9 downto 2) = X"00";
        SAV <= RUNIN and SELECTED_VIDEO(6) = '0';
        EAV <= RUNIN and SELECTED_VIDEO(6) = '1';
        
        if RUNIN then
            SEPARATED_H <= SELECTED_VIDEO(6);    -- Derive SEPARATED_H
            SEPARATED_V <= SELECTED_VIDEO(7);    -- Derive SEPARATED_V
            SEPARATED_F <= SELECTED_VIDEO(8);    -- Derive SEPARATED_F
        else
            SEPARATED_H <= SEPARATED_H;
            SEPARATED_V <= SEPARATED_V;
            SEPARATED_F <= SEPARATED_F;
        end if;

-- Horizontal count. 11 bit count along line for up to 2048 pixels. 0-1439 used
-- Bit 0 is C/Y alternating
-- Bits 4:1 are 16 SDRAM locations in a block
-- Bit 5: Block alternation
-- Bits 10:5 are up to 64 blocks per line of which only 45 are needed
        if SAV then HCOUNT <= conv_std_logic_vector(0,11);
            else     HCOUNT <= HCOUNT + 1;
        end if;

        SELECTED_VIDEO_DELAYED <= SELECTED_VIDEO;    -- Delay video to match HCOUNT when writing to DPRAM
        
-- Write video to pingpong buffers. Inferred DPRAM    2off 32x10. Address is 4 bits of word count and LSB of block count to alternate banks    
        if HCOUNT(0) = '0' then PING_PONG_RAM_C(conv_integer(HCOUNT(5 downto 1))) <= SELECTED_VIDEO_DELAYED; end if;
        if HCOUNT(0) = '1' then PING_PONG_RAM_Y(conv_integer(HCOUNT(5 downto 1))) <= SELECTED_VIDEO_DELAYED; end if;
        
-- Make VREF from both edges of SEPARATED_F. Delay edges unequally to give "natural" timing with long field 1
-- Vertical 9 bit counter
        if SAV then    
            DELAYED_SEPARATED_F <= DELAYED_SEPARATED_F(1 downto 0) & SEPARATED_F;

            VREF <= (not DELAYED_SEPARATED_F(0) and DELAYED_SEPARATED_F(1)) or (not DELAYED_SEPARATED_F(2) and DELAYED_SEPARATED_F(1));
        
            if VREF = '1' then VCOUNT <= SET_VCOUNT_START;
                else            VCOUNT <= VCOUNT + 1;
            end if;        
        end if; -- SAV

-- Request SDRAM slot for blocks 0-47. Not during V blanking        
        AREQ <= ALLOW_SDRAM_WRITES and (SEPARATED_V = '0') and HCOUNT(10 downto 5) < "110000" and HCOUNT(4 downto 0) = "11110";    -- SDRAM slot request is registered

-- First stage of address transfer at AREQ. Needs special TIMESPEC
        if AREQ then INTERMEDIATE_ADDRESS <= SDRAM_FRAME & VCOUNT & SEPARATED_F & HCOUNT(10 downto 5) & "0000"; end if;
        
    end if; --CKR
end process;
--=================================================================================================================

-------------------------------------------- SDRAM interface ------------------------------------------------------
-- SDRAM slot request. Requested for each SDRAM block, async reset by GRANT. 1st half of synchroniser.
process (CKR, GRANT) begin
    if GRANT = '1' then PRE_REQUEST <= '0';
        elsif rising_edge(CKR) then
                if AREQ then PRE_REQUEST <= '1';
            end if;
    end if; -- CKR/GRANT
end process;

SDRAM_WRITE_ENABLE_START <= DELAYED_GRANT(2);    -- Was 1

process (CKRAM) begin
    if rising_edge(CKRAM) then
    
        REQUEST <= PRE_REQUEST;    -- SDRAM slot request. 2nd half of synchroniser.
        
        DELAYED_SLOT_INCREMENT <= SLOT_INCREMENT;        -- SLOT_INCREMENT delayed by 1 is same as T=0
        GRANT <= W_SLOT and REQUEST and DELAYED_SLOT_INCREMENT;    -- Grant = REQ and T=0 and SLOT
        
        DELAYED_GRANT <= DELAYED_GRANT(6 downto 0) & GRANT;    -- GRANT is delayed to get SDRAM write pulse at correct timing
        VERY_DELAYED_GRANT <= VERY_DELAYED_GRANT(14 downto 0) & SDRAM_WRITE_ENABLE_START;    -- 16 block length (delay) defines SDRAM write pulse
        
-- SDRAM write enable
        if not ALLOW_SDRAM_WRITES               then SDRAM_WRITE_ENABLE <= '0';
            elsif SDRAM_WRITE_ENABLE_START = '1' then SDRAM_WRITE_ENABLE <= '1';    -- JK flipflop
            elsif VERY_DELAYED_GRANT(15)   = '1' then SDRAM_WRITE_ENABLE <= '0';
            else                                      SDRAM_WRITE_ENABLE <= SDRAM_WRITE_ENABLE;
        end if;

-- Second stage of address transfer at GRANT. 4 LSBs are 0 for 16 block. Registers for LSBs will be optimised out
-- Needs special TIMESPEC
        if GRANT = '1' then SDRAM_WRITE_ADDRESS <= INTERMEDIATE_ADDRESS; end if;    --

-- Read address for pingpong buffer        
        if DELAYED_GRANT(1) = '1' then
            DP_RAM_READ_ADDRESS(3 downto 0) <= "0000";    -- Reset word count
            DP_RAM_READ_ADDRESS(4) <= not HCOUNT(5);    -- Alternating blocks
        else
            DP_RAM_READ_ADDRESS(3 downto 0) <= DP_RAM_READ_ADDRESS(3 downto 0) + 1;
            DP_RAM_READ_ADDRESS(4) <= DP_RAM_READ_ADDRESS(4);
        end if;
    
-- Read DPRAM pingpong buffer. Inferred DPRAM
        SDRAM_WRITE_DATA_C <=  PING_PONG_RAM_C(conv_integer(DP_RAM_READ_ADDRESS));
        SDRAM_WRITE_DATA_Y <=  PING_PONG_RAM_Y(conv_integer(DP_RAM_READ_ADDRESS));
            
    end if; -- CKRAM        
end process;    

-- Assemble 32 bit SDRAM write data, 22 bit SDRAM write address and SDRAM write enable
oSDRAM_WRITE_DATA    <= "000000" & SDRAM_WRITE_DATA_C & "000000" & SDRAM_WRITE_DATA_Y;
oSDRAM_WRITE_ADDRESS <= SDRAM_WRITE_ADDRESS;
oSDRAM_WRITE_ENABLE <= SDRAM_WRITE_ENABLE;
--====================================================================================================================    

TP(1) <= SEPARATED_H;
TP(2) <= SEPARATED_V;
TP(3) <= SEPARATED_F;
TP(4) <= SELECTED_CLOCK;
TP(5) <= to_std_logic(SAV);
TP(6) <= to_std_logic(AREQ);
TP(7) <= to_std_logic(SELECT_SIMM2);
TP(8) <= V2R(8);
TP(9) <= SELECTED_VIDEO(8);
TP(10) <= VREF;

--TP <= (others => '0');    -- TP not used
-- to_std_logic()
end rtl;
Alternating Y/C at 27MHz is the customary way of transporting standard definition digital video. Often colloquially known as "601" after the ITU-R BT.601 standard. Should also have TRS codes, the digital equivalent of sync pulses. See line 175 etc of the input code for my converter (the section commented "Sync separator").
Those who have looked inside their Aurora SCRF converters will see that Darryl used the TVP5150AM1PBS decoder. The TVP5146 is a high end device that strikes me as overkill for most applications.

http://www.ti.com/product/tvp5146m2/description
http://www.ti.com/product/tvp5150am1/description